METHODS OF GENERATING AND SCREENING FOR PROTEASES WITH ALTERED SPECIFICITY

RELATED APPLICATIONS 

This application claims priority from U.S.S.N. 60/415,388, filed October 2, 2002,
which is incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION 

Enzymes are used within a wide range of applications. An important group of
enzymes is the proteases, which cleave proteins. Many proteases cleave target proteins
specifically at defined substrate sequences. This tendency for specific cleavage by proteases
is referred to as substrate, or substrate sequence, specificity. Substrate specificities associated
with different members of the diverse families of proteolytic enzymes can be attributed, in
part, to different sets of amino acids within the binding domain that are utilized by each
enzyme family for substrate recognition and catalysis. A rational approach to engineering
mutant enzymes has been successful for several proteases. A conserved amino acid residue
of subtilisin (glycine 166), known from crystallographic data to reside within the binding cleft,
was changed to one of several different amino acid residues. The resulting enzyme
derivatives showed dramatic changes in specificity toward substrates with increasing
hydrophobicity and amino acid size (Wells, et al., Cold Spring Harb. Symp. Quant. Biol.,
(1987) 52: 647-52). Another bacterially encoded serine endopeptidase, α-lytic protease, has
also been rationally altered by changing methionine 192 to an alanine. The resulting
mutation within the active site of the enzyme appears to have increased structural flexibility
of the enzyme active site. The resulting α-lytic protease derivative has a broader substrate
specificity towards larger, more hydrophobic targets (Bone, et al., Biochemistry, 1991,
(43):10388-98).

The serine proteases are an extensively studied family of related endopeptidases,
characterized by their so-called catalytic triad: Asp His Ser. Within a family of similar 
proteins, the regions of conserved primary, secondary and tertiary structure tend to include 
the residues involved in the active site(s), as well as other residues important to activity. For 
example, the members of the catalytic triad are far apart in the primary structures (amino acid 
sequence) of serine proteases, but these residues are brought to within bond forming distance 
by the tertiary structure (or folding), of these proteins. 

Serine proteases differ markedly in substrate sequence recognition properties: some
are highly specific (e.g., the proteases involved in blood coagulation and the immune
complement system); some are only partially specific (e.g., the mammalian digestive
proteases trypsin and chymotrypsin); and others, like subtilisin, a bacterial protease, are
completely non-specific. Despite these differences in specificity, the catalytic mechanism of
serine proteases is well conserved, consisting of a substrate sequence-binding site that 
correctly positions the scissile peptide in the active site with five hydrogen-bonds. Once the 
peptide is bound, a hydrogen-bond network between the three invariant residues of the 
catalytic triad catalyzes the hydrolysis of the peptide bond. This large family of proteases can 
be broadly divergent in their sequence specificities despite being highly conserved in their 
mechanism of catalysis. 



SUMMARY OF THE INVENTION 

The present invention is drawn to the generation and screening of proteases that 
cleave proteins known to be involved in disease. The resultant proteins may be used as 
agents for in vivo therapy. 

The invention is broadly drawn to the modification of proteases to alter their substrate 
sequence specificity, so that the modified proteases cleave a target protein which is involved 
with or causes a pathology. In one embodiment of the invention, this modified protease is a 
serine protease. In another embodiment of the invention, this modified protease is a cysteine 
protease. 

One embodiment of the invention involves generating a library of protease sequences 
to be used to screen for modified proteases that cleave a desired target protein at a desired 
substrate sequence. In one aspect of this embodiment, each member of the library is a
protease scaffold carrying a number of mutations. A
protease scaffold has the same or a similar sequence to a known protease. In one
embodiment, this scaffold is a serine protease. In another embodiment of the invention, this
scaffold is a cysteine protease. The cleavage activity of each member of the library is
measured using the desired substrate sequence from the desired target protein. As a result,
proteases with the highest cleavage activity with regard to the desired substrate sequence are
detected.

In another aspect of this embodiment, the number of mutations made to the protease
scaffold is 1, 2-5 (e.g., 2, 3, 4 or 5), 5-10 (e.g., 5, 6, 7, 8, 9 or 10), or 10-20 (e.g., 10, 11, 12, 13,
14, 15, 16, 17, 18, 19 or 20).

In another aspect of this embodiment, the known protease scaffold can include the
amino acid sequence of trypsin, chymotrypsin, subtilisin, thrombin, plasmin, Factor Xa,
urokinase type plasminogen activator (uPA), tissue plasminogen activator (tPA), membrane
type serine protease-1 (MTSP-1), granzyme A, granzyme B, granzyme M, elastase, chymase,
papain, neutrophil elastase, plasma kallikrein,
complement factor serine proteases, ADAMTS13, neutral endopeptidase/neprilysin, furin, or
cruzain.

In another aspect of this embodiment, the target protein is involved with a pathology;
for example, the target protein causes the pathology. The pathology can be, e.g., rheumatoid
arthritis, sepsis, cancer, acquired immunodeficiency syndrome, respiratory tract infections,
influenza, cardiovascular disease, or asthma.

In another aspect of this embodiment, the activity of the detected protease is increased 
by at least 10-fold, 100-fold, or 1000-fold over the average activity of the library. 
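
By way of illustration only, the fold-over-average criterion described above can be sketched in Python as follows. The function name `select_hits`, the variant names, and the activity values are hypothetical and are not taken from the disclosure; the sketch assumes that the "average activity of the library" is the arithmetic mean over all measured members, which the text does not specify.

```python
from statistics import mean


def select_hits(activities, fold_threshold=10.0):
    """Return {member: fold_over_mean} for members at or above the threshold.

    `activities` maps a library member name to its measured cleavage
    activity (arbitrary but consistent units) against the desired
    substrate sequence. The library average is assumed to be the
    arithmetic mean of all members, including any hits.
    """
    library_mean = mean(activities.values())
    if library_mean <= 0:
        raise ValueError("mean library activity must be positive")
    folds = {name: value / library_mean for name, value in activities.items()}
    return {name: fold for name, fold in folds.items() if fold >= fold_threshold}


# Hypothetical data: 19 weakly active variants and one strongly active variant.
activities = {f"variant_{i}": 1.0 for i in range(1, 20)}
activities["variant_20"] = 100.0

hits = select_hits(activities, fold_threshold=10.0)
print(hits)
```

Under this assumption the mean includes the hit itself, so the fold value can never exceed the number of library members; a 1000-fold criterion would therefore presuppose a library of more than 1000 members or a differently defined average.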

Another embodiment of the invention involves generating a library of substrate
sequences to be used to screen a modified protease in order to detect which substrate
sequence(s) the modified protease cleaves most efficiently. The members of the library are
made up of randomized amino acid sequences, and the cleavage activity of each member of
the library by the protease is measured. Substrate sequences which are cleaved most
efficiently by the protease are detected.

In another aspect of this embodiment, the substrate sequence in a library is 4, 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 amino acids long.
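
As a non-limiting sketch, a library of randomized substrate sequences of a chosen length could be enumerated in silico as shown below. The function name `random_substrate_library`, the use of the twenty standard amino acids with uniform sampling, and the seed parameter are assumptions made for illustration; the disclosure does not prescribe how the randomization is performed.

```python
import random

# One-letter codes for the twenty standard amino acids (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_substrate_library(size, length, seed=None):
    """Generate `size` random peptide sequences, each `length` residues long.

    Lengths are restricted to 4-20 residues, matching the range recited
    above. Sampling is uniform and independent at every position.
    """
    if not 4 <= length <= 20:
        raise ValueError("length must be between 4 and 20 residues")
    rng = random.Random(seed)
    return [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        for _ in range(size)
    ]


# Hypothetical usage: fifty 8-residue candidate substrate sequences.
library = random_substrate_library(size=50, length=8, seed=0)
```

A fixed seed is used here only so that the sketch is reproducible.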

In another aspect of this embodiment, the substrate sequence is a part of a target
protein. This target protein can be involved in a pathology. For example, the target protein is
one which causes a pathology. In another aspect of this embodiment, this pathology is
rheumatoid arthritis, sepsis, cancer, acquired immunodeficiency syndrome, respiratory tract
infections, influenza, cardiovascular disease, or asthma.

In another aspect of this embodiment, the efficiency of cleavage of the detected
substrate sequence is increased by at least 10-fold, at least 100-fold, or at least 1000-fold over
the average activity of the library.

In yet another embodiment, the invention includes a method for treating a patient 
having a pathology. The method involves administering to the patient a protease that cleaves 
a target protein involved with the pathology, so that cleaving the protein treats the pathology. 

In one aspect of this embodiment, the pathology can be rheumatoid arthritis, sepsis,
cancer, acquired immunodeficiency syndrome, respiratory tract infections, influenza,
cardiovascular disease, or asthma. In another aspect of this embodiment, the protease can be
a serine protease. In another embodiment of the invention, this modified protease is a
cysteine protease. In another aspect of this embodiment, the target protein causes the
pathology.

The patient having a pathology, e.g., the patient treated by the methods of this
invention, can be a mammal or, more particularly, a human.

In another aspect of the embodiment, the target protein can be tumor necrosis factor,
tumor necrosis factor receptor, interleukin-1, interleukin-1 receptor, interleukin-2,
interleukin-2 receptor, interleukin-4, interleukin-4 receptor, interleukin-5, interleukin-5
receptor, interleukin-12, interleukin-12 receptor, interleukin-13, interleukin-13 receptor,
p-selectin, p-selectin glycoprotein ligand, Substance P, the Bradykinins, PSGL, factor IX,
immunoglobulin E, immunoglobulin E receptor, CCR5, CXCR4, glycoprotein 120,
glycoprotein 41, CD4, hemagglutinin, respiratory syncytial virus fusion protein, B7, CD28,
CD2, CD3, CD4, CD40, vascular endothelial growth factor, VEGF receptor, fibroblast
growth factor, epidermal growth factor, EGF receptor, TGF receptor, transforming growth
factor, Her2, CCR1, CXCR3, CCR2, Src, Akt, Bcl-2, BCR-Abl, glycogen synthase kinase-3,
cyclin dependent kinase-2 (cdk-2), or cyclin dependent kinase-4 (cdk-4).

Unless otherwise defined, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which this invention
belongs. Although methods and materials similar or equivalent to those described herein can
be used in the practice or testing of the present invention, suitable methods and materials are
described below. All publications, patent applications, patents, and other references
mentioned herein are incorporated by reference in their entirety. In the case of conflict, the
present specification, including definitions, will control. In addition, the materials, methods,
and examples are illustrative only and are not intended to be limiting.

Other features and advantages of the invention will be apparent from the following 
detailed description and claims. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a schematic of the amino acid sequence of caspase-3, and a table
containing the amino acid sequence of caspase-3 (SEQ ID NO: 1).

Figure 2 is a diagram from X-ray crystallography showing the structure of caspase-3
focusing on the inactivation sequence cleaved by I99A/N218A granzyme B, along with a
diagram showing the amino acid sequence of the inactivation sequence site of the enzyme
(SEQ ID NO:2).

Figure 3A shows a series of bar graphs showing the substrate specificity of wild-type
granzyme B versus the substrate specificity of I99A/N219A granzyme B at P2, P3 and P4.
Figure 3B shows a table containing kinetic data derived from the graphs in Figure 3A.

Figure 4 shows a series of graphs from MALDI mass spectrometry of a peptide
corresponding to the inactivation sequence in caspase-3 in the presence of wild-type and
I99A/N219A granzyme B.

Figure 5 depicts an SDS-PAGE gel showing bands for cleavage products of caspase-3
small subunit by wild-type and I99A/N219A granzyme B.

Figure 6A shows a graph plotting caspase-3 activity over time in the presence of
wild-type granzyme B and I99A/N219A granzyme B. Figure 6B shows a bar graph showing the
Vmax of the activity of caspase-3 in the presence of wild-type and I99A/N219A granzyme B.

Figure 7A shows a bar graph plotting apoptosis in cell lysates as measured by
caspase-3 activity in the presence of I99A/N218A granzyme B. Figure 7B shows a bar graph
plotting apoptosis in cell lysates as measured by caspase-3 activity in the presence of
wild-type and increasing concentrations of I99A/N218A granzyme B.

DETAILED DESCRIPTION OF THE INVENTION 

Prior to setting forth the invention in detail, certain terms used herein will be defined.

The term "allelic variant" denotes any of two or more alternative forms of a gene
occupying the same chromosomal locus. Allelic variation arises naturally through mutation,
and may result in phenotypic polymorphism within populations. Gene mutations can be
silent (no change in the encoded polypeptide) or may encode polypeptides having altered
amino acid sequence. The term "allelic variant" is also used herein to denote a protein
encoded by an allelic variant of a gene.

The term "complements of polynucleotide molecules" denotes polynucleotide
molecules having a complementary base sequence and reverse orientation as compared to a
reference sequence. For example, the sequence 5' ATGCACGGG 3' is complementary to 5'
CCCGTGCAT 3'.
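
The complement relationship in the example above can be checked with a short Python sketch. The function name `reverse_complement` is chosen here for illustration, and the sketch handles only the four unambiguous DNA bases.

```python
# Translation table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the complement of a DNA sequence, written 5' to 3'.

    Each base is replaced by its pairing partner and the result is
    reversed, so both input and output read in the 5' to 3' direction.
    """
    return sequence.upper().translate(_COMPLEMENT)[::-1]


# The pair of sequences given in the definition above.
assert reverse_complement("ATGCACGGG") == "CCCGTGCAT"
```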

The term "degenerate nucleotide sequence" denotes a sequence of nucleotides that
includes one or more degenerate codons (as compared to a reference polynucleotide molecule
that encodes a polypeptide). Degenerate codons contain different triplets of nucleotides, but
encode the same amino acid residue (e.g., GAU and GAC triplets each encode Asp).

A "DNA construct" is a single or double stranded, linear or circular DNA molecule
that comprises segments of DNA combined and juxtaposed in a manner not found in nature.
DNA constructs exist as a result of human manipulation, and include clones and other copies
of manipulated molecules.

A "DNA segment" is a portion of a larger DNA molecule having specified attributes.
For example, a DNA segment encoding a specified polypeptide is a portion of a longer DNA
molecule, such as a plasmid or plasmid fragment, that, when read from the 5' to the 3'
direction, encodes the sequence of amino acids of the specified polypeptide.

The term "expression vector" denotes a DNA construct that comprises a segment
encoding a polypeptide of interest operably linked to additional segments that provide for its
transcription in a host cell. Such additional segments may include promoter and terminator
sequences, and may optionally include one or more origins of replication, one or more
selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors
are generally derived from plasmid or viral DNA, or may contain elements of both.

The term "isolated", when applied to a polynucleotide molecule, denotes that the
polynucleotide has been removed from its natural genetic milieu and is thus free of other
extraneous or unwanted coding sequences, and is in a form suitable for use within genetically
engineered protein production systems. Such isolated molecules are those that are separated
from their natural environment and include cDNA and genomic clones, as well as synthetic
polynucleotides. Isolated DNA molecules of the present invention may include naturally
occurring 5' and 3' untranslated regions such as promoters and terminators. The identification
of associated regions will be evident to one of ordinary skill in the art (see, for example,
Dynan and Tjian, Nature 316:774-78, 1985). When applied to a protein, the term "isolated"
indicates that the protein is found in a condition other than its native environment, such as
apart from blood and animal tissue. In a preferred form, the isolated protein is substantially
free of other proteins, particularly other proteins of animal origin. It is preferred to provide
the protein in a highly purified form, i.e., at least 90% pure, preferably greater than 95% pure,
more preferably greater than 99% pure.

The term "operably linked", when referring to DNA segments, denotes that the
segments are arranged so that they function in concert for their intended purposes, e.g.,
transcription initiates in the promoter and proceeds through the coding segment to the
terminator.

The term "ortholog" denotes a polypeptide or protein obtained from one species that is 
the functional counterpart of a polypeptide or protein from a different species. Sequence 
differences among orthologs are the result of speciation. 

The term "polynucleotide" denotes a single- or double-stranded polymer of
deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. Polynucleotides
include RNA and DNA, and may be isolated from natural sources, synthesized in vitro, or
prepared from a combination of natural and synthetic molecules. The length of a
polynucleotide molecule is given herein in terms of nucleotides (abbreviated "nt") or base
pairs (abbreviated "bp"). The term "nucleotides" is used for both single- and double-stranded
molecules where the context permits. When the term is applied to double-stranded molecules
it is used to denote overall length and will be understood to be equivalent to the term "base
pairs". It will be recognized by those skilled in the art that the two strands of a
double-stranded polynucleotide may differ slightly in length and that the ends thereof may be
staggered as a result of enzymatic cleavage; thus all nucleotides within a double-stranded
polynucleotide molecule may not be paired. Such unpaired ends will, in general, not exceed
20 nt in length.

The term "promoter" denotes a portion of a gene containing DNA sequences that
provide for the binding of RNA polymerase and initiation of transcription. Promoter
sequences are commonly, but not always, found in the 5' non-coding regions of genes.

A "protease" is an enzyme that cleaves peptide bonds in proteins. A "protease
precursor" is a relatively inactive form of the enzyme that commonly becomes activated upon
cleavage by another protease.

The term "secretory signal sequence" denotes a DNA sequence that encodes a
polypeptide (a "secretory peptide") that, as a component of a larger polypeptide, directs the
larger polypeptide through a secretory pathway of a cell in which it is synthesized. The
larger polypeptide is commonly cleaved to remove the secretory peptide during transit
through the secretory pathway.

The term "substrate sequence" denotes a sequence that is specifically targeted for
cleavage by a protease.

The term "target protein" denotes a protein that is specifically cleaved at its substrate
sequence by a protease.

The terms "S1-S4" refer to the residues in a protease that make up the substrate
sequence binding pocket. They are numbered sequentially from the recognition site
N-terminal to the site of proteolysis, the scissile bond.

The terms "P1-P4" and "P1'-P4'" refer to the residues in a peptide to be cleaved that
specifically interact with the S1-S4 residues found above. P1-P4 generally comprise the
substrate sequence. P1-P4 are the positions on the N-terminal side of the cleavage site,
whereas P1'-P4' are the positions to the C-terminal side of the cleavage site. (See Figure 2).
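
The position numbering just defined can be illustrated with the following Python sketch, which labels the residues on either side of a scissile bond. The function name `label_positions`, the convention of identifying the bond by the count of residues N-terminal to it, and the example peptide are all hypothetical and serve only to show the numbering.

```python
def label_positions(peptide, cleavage_index, span=4):
    """Map P/P' labels to residues around a scissile bond.

    `cleavage_index` is the number of residues N-terminal to the
    scissile bond, i.e. the bond lies between peptide[cleavage_index - 1]
    (P1) and peptide[cleavage_index] (P1'). Positions that fall outside
    the peptide are omitted.
    """
    if not 0 < cleavage_index < len(peptide):
        raise ValueError("scissile bond must lie inside the peptide")
    labels = {}
    for i in range(1, span + 1):
        n_side = cleavage_index - i        # P1, P2, ... toward the N-terminus
        c_side = cleavage_index + i - 1    # P1', P2', ... toward the C-terminus
        if n_side >= 0:
            labels[f"P{i}"] = peptide[n_side]
        if c_side < len(peptide):
            labels[f"P{i}'"] = peptide[c_side]
    return labels


# Hypothetical 8-residue peptide cleaved between its 4th and 5th residues.
labels = label_positions("IEPDSGVL", cleavage_index=4)
```

For this example peptide, P4 through P1 read I, E, P, D and P1' through P4' read S, G, V, L.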

The term "scaffold" refers to an existing protease to which various mutations are made.
Generally, these mutations change the specificity and activity of the scaffold.

An "isolated" or "purified" polypeptide or protein or biologically-active portion
thereof is substantially free of cellular material or other contaminating proteins from the cell
or tissue source from which the protease protein is derived, or substantially free from
chemical precursors or other chemicals when chemically synthesized. The language
"substantially free of cellular material" includes preparations of protease proteins in which
the protein is separated from cellular components of the cells from which it is isolated or
recombinantly-produced. In one embodiment, the language "substantially free of cellular
material" includes preparations of protease proteins having less than about 30% (by dry
weight) of non-protease proteins (also referred to herein as a "contaminating protein"), more
preferably less than about 20% of non-protease proteins, still more preferably less than about
10% of non-protease proteins, and most preferably less than about 5% of non-protease
proteins. When the protease protein or biologically-active portion thereof is
recombinantly-produced, it is also preferably substantially free of culture medium, i.e.,
culture medium represents less than about 20%, more preferably less than about 10%, and
most preferably less than about 5% of the volume of the protease protein preparation.

The language "substantially free of chemical precursors or other chemicals" includes
preparations of protease proteins in which the protein is separated from chemical precursors
or other chemicals that are involved in the synthesis of the protein. In one embodiment, the
language "substantially free of chemical precursors or other chemicals" includes preparations
of protease proteins having less than about 30% (by dry weight) of chemical precursors or
non-protease chemicals, more preferably less than about 20% chemical precursors or
non-protease chemicals, still more preferably less than about 10% chemical precursors or
non-protease chemicals, and most preferably less than about 5% chemical precursors or
non-protease chemicals.

The present invention is drawn to methods for generating and screening proteases to
cleave target proteins at a given substrate sequence. Proteases are protein-degrading enzymes
that recognize an amino acid or an amino acid substrate sequence within a target protein.
Upon recognition of the substrate sequence, proteases catalyze the hydrolysis or cleavage of a
peptide bond within a target protein. Such hydrolysis of the target protein may inactivate it,
depending on the location of the peptide bond within the full-length sequence of
the target protein. The specificity of proteases can be altered through protein engineering.
If a protease is engineered to recognize a substrate sequence within a target protein or
proteins such that (i) cleavage alters the function, e.g., by inactivation, of the target protein(s)
upon catalysis of peptide bond hydrolysis, and (ii) the target protein(s) are recognized or
unrecognized points of molecular intervention for a particular disease or diseases, then the
engineered protease has a therapeutic effect via a proteolysis-mediated inactivation event. In
particular, proteases can be engineered to cleave receptors between their transmembrane and
cytokine binding domains. The stalk regions which function to tether protein receptors to the
surface of a cell, or loop regions, are thereby disconnected from the globular domains in a
polypeptide chain.

In one embodiment, the target protein to be cleaved is involved with a pathology, 
where cleaving the target protein at a given substrate sequence serves as a treatment for the 
pathology. 

In one embodiment, the protease cleaves a protein involved with rheumatoid arthritis.
For example, the protease cleaves the TNF receptor between the transmembrane domain and
the cytokine binding domain. This cleavage can inactivate the receptor. Rheumatoid arthritis
is thereby treated by inhibiting the action of tumor necrosis factor (TNF).

The protease cleaves the same targets as activated protein C. This cleavage can
attenuate the blood coagulation cascade. Sepsis is thereby treated by supplementing the
action of protein C.

The protease cleaves cell surface molecules that are responsible for tumorigenicity,
preventing the spread of cancer. For example, cleavage of cell surface molecules can
inactivate their ability to transmit extracellular signals, especially cell proliferation signals.
Without these signals, cancer cells often cannot proliferate. The protease of the invention
could therefore be used to treat cancer. In another aspect of this embodiment, the protease
could cleave any target protein that is responsible for the spread of cancer. Cleaving a target
protein involved in cell cycle progression could inactivate the ability of the protein to allow
the cell cycle to go forward. Without the progression of the cell cycle, cancer cells could not
proliferate. Therefore, the proteases of the invention could be used to treat cancer.

In another embodiment of the invention, the protease cleaves membrane fusion
proteins found on human immunodeficiency virus (HIV), Respiratory Syncytial Virus (RSV),
or influenza, inhibiting the ability of these viruses to infect cells. Without these membrane
proteins, these viruses would not be able to infect cells. Therefore, the protease could be used
to treat or prevent infection by HIV, RSV, or influenza.

In another embodiment of the invention, the protease cleaves the same target protein
as plasminogen activator. By cleaving the target of plasminogen activator, the thrombolytic
cascade is activated. In the case of a stroke or heart attack caused by a blood clot, the
protease can be used as a treatment for cardiovascular disease.

In another embodiment of the invention, the protease cleaves cytokines or receptors
that are involved in inflammation as a treatment for asthma or other pathologies associated
with inflammation. By cleaving the cytokine or receptors, the protease can inactivate the
signaling cascade involved with many inflammatory processes. The protease can thereby be
used to treat inflammation and related pathologies.

In another embodiment of the invention, the protease cleaves signaling molecules that
are involved in various signal cascades, including the signaling cascade responsible for the
regulation of apoptosis. For example, the protease cleaves a caspase. This caspase can be,
for example, caspase-3. By cleaving a protein involved in a signal cascade, the protease can
be used to inactivate or modulate the signal cascade.

In some examples, the engineered protease is designed to cleave any of the target
proteins in Table 1, thereby inactivating the activity of the protein. The protease can be used
to treat a pathology associated with that protein, by inactivating one of the target proteins.

Table 1

Target                Indication                        Molecule class
IL-5/IL-5R            Asthma                            Cytokine
IL-1/IL-1R            Asthma, inflammation,             Cytokine
                      rheumatic disorders
IL-13/IL-13R          Asthma                            Cytokine
IL-12/IL-12R          Immunological disorders           Cytokine
IL-4/IL-4R            Asthma                            Cytokine
TNF/TNFR              Asthma, Crohn's disease,          Cytokine
                      HIV infection, inflammation,
                      psoriasis, rheumatoid arthritis
CCR5/CXCR4            HIV infection                     GPCR
gp120/gp41            HIV infection                     Fusion protein
CD4                   HIV infection                     Receptor
Hemagglutinin         Influenza infection               Fusion protein
RSV fusion protein    RSV infection                     Fusion protein
B7/CD28               Graft-v.-host disorder,           Receptor
                      rheumatoid arthritis,
                      transplant rejection,
                      diabetes mellitus
IgE/IgER              Graft-v.-host disorder,           Receptor
                      transplant rejection
CD2, CD3, CD4, CD40   Graft-v.-host disorder,           Receptor
                      transplant rejection, psoriasis
IL-2/IL-2R            Autoimmune disorders,             Cytokine
                      graft-v.-host disorder,
                      rheumatoid arthritis
VEGF, FGF, EGF, TGF   Cancer                            Cytokine
Her2/neu              Cancer                            Receptor
CCR1                  Multiple sclerosis                GPCR
CXCR3                 Multiple sclerosis, rheumatoid    GPCR
                      arthritis
CCR2                  Atherosclerosis, rheumatoid       GPCR
                      arthritis
Src                   Cancer, osteoporosis              Kinase
Akt                   Cancer                            Kinase
Bcl-2                 Cancer                            Protein-protein
BCR-Abl               Cancer                            Kinase
GSK-3                 Diabetes                          Kinase
cdk-2/cdk-4           Cancer                            Kinase

The protease scaffolds are also any of the proteins disclosed below in Table 2. 



Table 2

Code         Name                                      Gene         Link         Locus
S01.010      granzyme B, human-type                    GZMB         3002         14q11.2
S01.011      testisin                                  PRSS21       10942        16p13.3
S01.015      tryptase beta 1 (Homo sapiens) (III)      TPSB1        7177         16p13.3
S01.017      kallikrein hK5                            KLK5         25818        19q13.3-q13.4
S01.019      corin                                     -            10699        4p13-p12
[illegible]  [illegible]                               [illegible]  [illegible]  19q13.3-q13.4
S01.021      DESC1 protease                            -            28983        4q13.3
S01.028      tryptase gamma 1                          -            -            16p13.3
S01.029      kallikrein hK14                           KLK14        43847        19q13.3-q13.4
[illegible]  hyaluronan-binding serine protease        [illegible]  [illegible]  10q25.3
             (HGF activator-like protein)
S01.034      transmembrane protease, serine 4          TMPRSS4      56649        11q23.3
S01.054      tryptase delta 1 (Homo sapiens)           TPSD1        23430        16p13.3
S01.074      marapsin                                  -            83886        16p13.3
S01.075      tryptase homologue 2 (Homo sapiens)       -            260429       -
S01.076      tryptase homologue 3 (Homo sapiens)       -            -            -
S01.077      tryptase chromosome 21 (Homo sapiens)     -            -            21q
S01.079      transmembrane protease, serine 3          TMPRSS3      64699        21q22.3
S01.081      kallikrein hK15 (Homo sapiens)            -            55554        19q13.41
S01.085      Mername-AA031 peptidase                   -            -            -
             (deduced from ESTs by MEROPS)
S01.087      membrane-type mosaic serine protease      -            84000        11q23
S01.088      Mername-AA038 peptidase                   -            -            -
S01.098      Mername-AA128 peptidase                   -            -            -
             (deduced from ESTs by MEROPS)
S01.1[?]7    cationic trypsin                          [illegible]  [illegible]  [illegible]
             (Homo sapiens-type 1 (cationic))
S01.131      neutrophil elastase                       ELA2         1991         19p13.3
S01.132      mannan-binding lectin-associated          -            -            -
             serine protease-3
S01.133      cathepsin G                               CTSG         1511         14q11.2
S01.134      myeloblastin (proteinase 3)               PRTN3        5657         19p13.3
S01.135      granzyme A                                GZMA         3001         5q11-q12
S01.139      granzyme M                                GZMM         3004         19p13.3
S01.140      chymase (human-type)                      CMA1         1215         14q11.2
S01.143      tryptase alpha (1)                        TPS1         7176         16p13.3
S01.146      granzyme K                                -            [illegible]  -
S01.147      granzyme H                                -            [illegible]  [illegible]
[illegible]  [illegible; possibly chymotrypsin B]      [illegible]  -            [illegible]
S01.153      pancreatic elastase                       ELA1         1990         12q13
S01.154      pancreatic endopeptidase E (A)            -            10136        1p36.12
[illegible]  elastase II (IIA)                         -            [illegible]  [illegible]
S01.156      enteropeptidase                           -            [illegible]  [illegible]
S01.157      chymotrypsin C                            -            11330        1
S01.159      prostasin                                 PRSS8        5652         16p11.2
[illegible]

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Engineering proteases. 

Virtually every aspect of a protease can be re-engineered, including the enzyme 
substrate sequence specificity, thermostability, pH profile, catalytic efficiency, oxidative 
stability, and catalytic function. 

Existing proteases are used as scaffolds into which various mutations are introduced to
change their substrate specificity. Scaffolds can largely include the amino acid sequences of
trypsin, chymotrypsin, subtilisin, thrombin, plasmin, Factor Xa, urokinase type plasminogen
activator (uPA), tissue plasminogen activator (tPA), granzyme B, elastase, papain, cruzain,
membrane type serine protease-1 (MTSP-1), chymase, neutrophil elastase, granzyme A,
plasma kallikrein, granzyme M, complement factor serine proteases, ADAMTS13, neutral
endopeptidase/neprilysin, and furin, or combinations thereof. Preferred scaffolds include
granzyme B, MTSP-1, chymase, neutrophil elastase, granzyme A, plasma kallikrein,
urokinase type plasminogen activator, granzyme M, chymotrypsin, thrombin, complement
factor serine proteases, ADAMTS13, neutral endopeptidase/neprilysin, furin, and plasmin.

Determinants of substrate sequence specificity in serine proteases come from the S1-S4
positions in the active site, where the protease is in contact with the P1-P4 residues of the
peptide substrate sequence. In some cases, there is little (if any) interaction between the
S1-S4 pockets of the active site, such that each pocket appears to recognize and bind the
corresponding residue on the peptide substrate sequence independently of the other pockets.
Thus the specificity determinants may generally be changed in one pocket without affecting
the specificity of the other pockets.

For example, a protease with low specificity for a residue at a particular binding site
or for a particular sequence is altered in its specificity by making point mutations in the
substrate sequence binding pocket. In some cases, the resulting mutant has a greater than
10-fold increase in specificity at a site or for a particular sequence relative to wild-type. In
another embodiment, the resulting mutant has a greater than 100-fold increase in specificity
at a site or for a particular sequence relative to wild-type. In another embodiment, the
resulting mutant has an over 1000-fold increase in specificity at a site or for a particular
sequence relative to wild-type.

Also contemplated by the invention are libraries of scaffolds with various mutations
that are generated and screened using methods known in the art and those detailed below.
Libraries are screened to ascertain the substrate sequence specificity of the members.
Libraries of scaffolds are tested for specificity by exposing the members to substrate peptide
sequences. The member with the mutations that allow it to cleave the substrate sequence is
identified. The library is constructed with enough variety of mutation in the scaffolds that
any substrate peptide sequence is cleaved by a member of the library. Thus, proteases
specific for any target protein can be generated.

The Process

1. Choosing A Scaffold

In another embodiment of the invention, scaffold proteases are chosen using the
following requirements: 1) the protease is a human or mammalian protease of known
sequence; 2) the protease can be manipulated through current molecular biology techniques;
3) the protease can be expressed heterologously at relatively high levels in a suitable host;
and 4) the protease can be purified to a chemically competent form at levels sufficient for
screening. In other embodiments of the invention, the scaffold protease to be mutated cleaves
a protein that is found extracellularly. This extracellular protein is, for example, a receptor, a
signaling protein, or a cytokine. The residues that, upon mutation, affect the activity and
specificity of two families of scaffold proteases are described here. Preferably, three
dimensional structural information for the protease is available. Also, it is preferred that
there be knowledge of the initial substrate specificity of the protease. It is also preferable that
the protease be active and stable in vitro and that knowledge of macromolecular modulators
of the protease is available. Also, proteases are preferred which cleave targets that are
relevant to affecting pathology, e.g., inactivating protein effectors of pathology.

Serine Proteases.

In another embodiment of the invention, serine proteases with altered specificity are
generated by a structure-based design approach. Each protease has a series of amino acids
that line the active site pocket and make direct contact with the substrate. Throughout the
chymotrypsin family, the backbone interactions between the substrate and enzyme are
completely conserved, but the side chain interactions vary considerably throughout the
family. The identity of the amino acids that comprise the S1-S4 pockets of the active site
determines the substrate specificity of that particular pocket. Grafting the amino acids of one
serine protease to another of the same fold modifies the specificity of one to the other. For
example, a mutation at position 99 in the S2 pocket to a small amino acid confers a
preference for large hydrophobic residues in the P2 substrate position. Using this process of
selective mutagenesis, followed by substrate library screening, proteases are designed with
novel substrate specificities towards proteins involved with various diseases.

The serine proteases are members of the same family as chymotrypsin, in that they
share sequence and structural homology with chymotrypsin. The active site residues are
Asp102, His57, and Ser195. The linear amino acid sequence can be aligned with that of
chymotrypsin and numbered according to the β sheets of chymotrypsin. Insertions and
deletions occur in the loops between the beta sheets, but throughout the structural family, the
core sheets are conserved. The serine proteases interact with a substrate in a conserved
β-sheet manner. Up to 6 conserved hydrogen bonds can occur between the substrate and the
enzyme.

Cysteine Proteases 

Papain-like cysteine proteases are a family of thiol dependent endo-peptidases related
by structural similarity to papain. They form a two-domain protein with the domains labeled
R and L (for right and left) and loops from both domains form a substrate recognition cleft.
They have a catalytic triad made up of the amino acids Cys25, His159 and Asn175. Unlike
serine proteases (which recognize and proteolyze a target peptide based on a β-sheet
conformation of the substrate), this family of proteases does not have well-defined pockets
for substrate recognition. The main substrate recognition occurs at the P2 amino acid
(compared to the P1 residue in serine proteases).

The S2 pocket is the most selective and best characterized of the protease substrate
recognition sites. It is defined by the amino acids at the following spatial positions (papain
numbering): 66, 67, 68, 133, 157, 160 and 205. Position 205 plays a role similar to position
189 in the serine proteases: a residue buried at the bottom of the pocket that determines the
specificity.

The substrate specificity of a number of cysteine proteases (human cathepsin L, V, K,
S, F, B, papain, and cruzain) was profiled using a complete diverse positional scanning
synthetic combinatorial library (PS-SCL). The complete library consists of P1, P2, P3, and
P4 tetrapeptide substrates in which one position is held fixed while the other three positions
are randomized with equal molar mixtures of the 20 possible amino acids, giving a total
diversity of ~160,000 tetrapeptide sequences.
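
The arithmetic behind the library size can be sketched as follows. This is only an illustration, assuming the 20 standard amino acids written in one-letter code; the function and variable names are hypothetical and do not come from the cited library protocols.

```python
from itertools import product

# Assumption: the "20 possible amino acids" are the 20 standard residues.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
POSITIONS = ("P4", "P3", "P2", "P1")


def library_diversity(n_positions=4, n_amino_acids=20):
    """Total number of distinct peptides when every position is varied."""
    return n_amino_acids ** n_positions


def sequences_per_sublibrary(n_positions=4, n_amino_acids=20):
    """Peptides in one pool: one position fixed, the others randomized."""
    return n_amino_acids ** (n_positions - 1)


# One pool per (fixed position, fixed residue) pair.
pools = list(product(POSITIONS, AMINO_ACIDS))

print(library_diversity())         # 160000 tetrapeptides in total
print(len(pools))                  # 80 positional-scanning pools
print(sequences_per_sublibrary())  # 8000 sequences in each pool
```

Each of the four positional sub-libraries covers the same 160,000 sequences, sorted by a different fixed position (20 pools x 8,000 sequences).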

Overall, P1 specificity was almost identical between the cathepsins, with Arg and Lys
being strongly favored while small aliphatic amino acids were tolerated. Much of the
selectivity was found in the P2 position, where the human cathepsins were strictly selective
for hydrophobic amino acids. Interestingly, P2 specificity for hydrophobic residues was
divided between aromatic amino acids such as Phe, Tyr, and Trp (cathepsin L, V), and bulky
aliphatic amino acids such as Val or Leu (cathepsin K, S, F). Compared to the P2 position,
selectivity at the P3 position was significantly less stringent. However, several of the
proteases showed a distinct preference for proline (cathepsin V, S, and papain), leucine
(cathepsin B), or arginine (cathepsin S, cruzain). The proteases showed broad specificity at
the P4 position, as no one amino acid was selected over others.

Substrate recognition profiles 

To make a variant protease with an altered substrate recognition profile, the amino
acids in the three-dimensional structure that contribute to the substrate selectivity (specificity
determinants) are targeted for mutagenesis. For the serine proteases, numerous structures of
family members have defined the surface residues that contribute to extended substrate
specificity (Wang et al., Biochemistry 2001 Aug 28;40(34):10038-46; Hopfner et al.,
Structure Fold Des. 1999 Aug 15;7(8):989-96; Friedrich et al., J Biol Chem. 2002 Jan
18;277(3):2160-8; Waugh et al., Nat Struct Biol. 2000 Sep;7(9):762-5). Structural
determinants for various proteases are listed in Table 3, along with a listing of the amino acid
in a subset of family members determined to be of known, extended specificity. For serine
proteases, the following amino acids in the primary sequence are determinants of specificity:
195, 102, 57 (the catalytic triad); 189, 190, 191, 192, and 226 (P1); 57, the loop between 58
and 64, and 99 (P2); 192, 217, 218 (P3); the loop between Cys168 and Cys180, 215 and 97 to
100 (P4).
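
The position list above can be held as a lookup table. The sketch below is illustrative only: it assumes the loops include their end points, it uses plain integers (so insertion residues such as 171A are not represented), and the names `DETERMINANTS` and `roles_of` are hypothetical.

```python
# Specificity determinants by chymotrypsin numbering, as listed in the text.
# Assumption: "the loop between 58 and 64" and "between Cys168 and Cys180"
# are taken to include both end points.
DETERMINANTS = {
    "catalytic triad": {57, 102, 195},
    "P1": {189, 190, 191, 192, 226},
    "P2": {57, 99} | set(range(58, 65)),
    "P3": {192, 217, 218},
    "P4": {215} | set(range(168, 181)) | set(range(97, 101)),
}


def roles_of(position):
    """Return every role assigned to a residue position (may be several)."""
    return [role for role, positions in DETERMINANTS.items()
            if position in positions]


print(roles_of(192))  # ['P1', 'P3']
print(roles_of(99))   # ['P2', 'P4']
print(roles_of(57))   # ['catalytic triad', 'P2']
```

A lookup of this kind makes it easy to see that some positions (57, 99, 192) are assigned to more than one role, so a mutation there may affect more than one subsite.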

Table 3. The structural determinants for various serine and cysteine proteases and their
corresponding substrate specificities.

Granzyme B is a member of the family of chymotrypsin fold serine proteases, and has
greater than 50% identity to other members of the granzyme family including granzymes
C-G, cathepsin G, and rat mast cell protease II. The protein is a sandwich of two six stranded,
anti-parallel β-barrel domains connected by a short α-helix. The catalytic triad is composed
of Asp102, His57 and Ser195. The surface loops are numbered according to the additions
and deletions compared to α-chymotrypsin and represent the most variable regions of this
structural family. The determinants of specificity are defined by the three-dimensional
structure of rat granzyme B in complex with ecotin [IEPD], a macromolecular inhibitor with
a substrate-like binding loop (Waugh et al., Nature Struct. Biol.). These structural
determinants of specificity include Ile99, Arg192, Asn218, Tyr215, Tyr174, Leu172, Arg226,
and Tyr151, by chymotrypsin numbering. Interestingly, the other members of the granzyme
family of serine proteases share only two of these amino acids with granzyme B. They are
Tyr215 and Leu172, two residues that vary very little across the entire structural family.
This suggests that while the sequence identity of the granzymes is high, their substrate
specificities are very different.

To determine the role of these amino acids in extended specificity, Ile99, Arg192,
Asn218 and Tyr174 were mutated to the amino acid alanine. It was determined that Ile99
contributes to P2 specificity, Asn218 and Arg192 to P3 specificity, and Tyr174 to P4
specificity. Each modified protease was profiled using a combinatorial substrate library to
determine the effect of the mutation on extended specificity. Since the P1 specificity of a
protease represents the majority of its specificity, the modifications do not destroy the unique
specificity of granzyme B towards P1 aspartic acid amino acids but modulate specificity in
the extended P2 to P4 sites.

For the P3 and P4 subsites, mutations at Tyr174, Arg192 and Asn218 did not
significantly affect the specificity (See Table 4, below). Y174A increases the activity
towards Leu at P4, but the rest of the amino acids continue to be poorly selected. R192A and
N218A both broaden the specificity at P3. Instead of a strong preference for glutamic acid,
Ala, Ser, Glu and Gln are similarly preferred in the mutant. The overall activity (kcat/Km) of
the mutant is less than 10% below the wild type activity toward an ideal wild-type substrate,
N-acetyl-Ile-Glu-Pro-Asp-AMC (7-amino-4-methylcoumarin) (Ac-IEPD-AMC).

A much more dramatic effect is observed at the P2 subsite (See Table 4, below). In
wild type granzyme B, the preference is broad with a slight preference for Pro residues. I99A
narrows the P2 specificity to Phe and Tyr residues. Phe is now preferred nearly 5 times over
the average activity of other amino acids at this position. Within the chymotrypsin family of
serine proteases, more than a dozen proteases have a small residue at this structural site,
either an asparagine, serine, threonine, alanine or glycine. From this group, two proteases
have been profiled using combinatorial substrate libraries (plasma kallikrein and plasmin),
and both show strong preferences towards Phe and Tyr. These two results suggest that any
serine protease that is mutated to an Asn, Ser, Thr, Gly or Ala at position 99 will show the
same hydrophobic specificity found in plasma kallikrein, plasmin and the I99A granzyme B
mutant.
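
A preference such as "nearly 5 times over the average activity of other amino acids" can be computed from a positional profile as sketched below. The activity values are hypothetical placeholders chosen for the example, not data from Table 4, and `fold_preference` is an illustrative name.

```python
def fold_preference(profile, residue):
    """Activity for one residue divided by the mean activity of all others."""
    others = [value for aa, value in profile.items() if aa != residue]
    return profile[residue] / (sum(others) / len(others))


# Hypothetical relative activities at P2 for a mutant (arbitrary units).
p2_profile = {"Phe": 50.0, "Tyr": 30.0, "Pro": 10.0, "Ala": 5.0, "Gly": 5.0}

print(fold_preference(p2_profile, "Phe"))  # 4.0
```

In a real profile the dictionary would hold one entry per fixed amino acid in the P2 sub-library.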

The understanding of the P2 specificity determinants may be expanded to the
contrasting mutation and substrate preference. Nearly two dozen chymotrypsin-fold serine
proteases have an aromatic amino acid at position 99. Four of these proteases have been
profiled using combinatorial substrate libraries: human granzyme B, tissue type plasminogen
activator, urokinase type plasminogen activator, and membrane type serine protease 1. All
but granzyme B have a preference for serine, glycine and alanine amino acids at the substrate
P2 position.

Table 4. Granzyme B Mutations

From Tables 4 and 5, the determinants of specificity selected to be altered in rat
granzyme B are as follows: Ser195, Asp102, His57, Ala189, Ser190, Phe191, Arg192,
Arg226, Ser58, Gly59, Ser60, Lys61, Ile62, Asn63, Ile99, Gln217, Asn218, Glu169, Ser170,
Tyr171, Leu171A (note the one amino acid insertion as compared to chymotrypsin), Lys172,
Asn173, Tyr174, Phe175, Asp176, Lys177, Ala178, Asn179, Glu180, Ile181, Tyr215, Lys97,
Thr98, Ile99, and Ser100.

For the cysteine proteases, the amino acids selected to be modified are less well
described. The S2 pocket is the most selective and best characterized of the protease substrate
recognition sites. It is defined by the amino acids at the following three-dimensional
positions (papain numbering): 66, 67, 68, 133, 157, 160 and 205. Position 205 plays a role
similar to position 189 in the serine proteases: a residue buried at the bottom of the pocket
that determines the specificity. The other specificity determinants include the following
amino acids (numbered according to papain): 61 and 66 (P3); 19, 20, and 158 (P1).

Table 6. The structural determinants for various cysteine proteases and their
corresponding substrate specificities.

2. Mutagenesis Of The Scaffold Protease

In order to change the substrate preference of a given subsite (S1-S4) for a given
amino acid, the specificity determinants that line the binding pocket are mutated, either
individually or in combination. In one embodiment of the invention, a saturation
mutagenesis technique is used in which the residue(s) lining the pocket is mutated to each of
the 20 possible amino acids. This can be accomplished using the Kunkel method (Current
Protocols in Molecular Biology, John Wiley and Sons, Inc., Media Pa.). Briefly, a mutagenic
oligonucleotide primer is synthesized which contains either NNS or NNK-randomization at
the desired codon. The primer is annealed to the single stranded DNA template and DNA
polymerase is added to synthesize the complementary strand of the template. After ligation,
the double stranded DNA template is transformed into E. coli for amplification.
Alternatively, single amino acid changes are made using standard, commercially available
site-directed mutagenesis kits such as QuikChange (Stratagene). In another embodiment, any
method commonly known in the art for site specific amino acid mutation could be used.
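
Why NNS or NNK randomization suffices for saturation mutagenesis can be checked by enumerating the codons each scheme encodes. The sketch below uses the standard genetic code and the IUPAC degenerate base codes (N = A/C/G/T, K = G/T, S = C/G); the helper names are illustrative only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AA_TABLE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AA_TABLE)}

# IUPAC degenerate nucleotide codes used by the two schemes.
DEGENERATE = {"N": "ACGT", "K": "GT", "S": "CG"}


def expand(scheme):
    """List every concrete codon encoded by a degenerate codon like 'NNK'."""
    return ["".join(c) for c in product(*(DEGENERATE[b] for b in scheme))]


def summarize(scheme):
    """Return (codon count, sorted amino acids covered, stop codons)."""
    codons = expand(scheme)
    amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    return len(codons), amino_acids, stops


for scheme in ("NNK", "NNS"):
    n_codons, amino_acids, stops = summarize(scheme)
    print(scheme, n_codons, len(amino_acids), stops)
# NNK 32 20 ['TAG']
# NNS 32 20 ['TAG']
```

Both schemes cover all 20 amino acids with 32 codons instead of 64 and retain only one of the three stop codons.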

3. Express And Purify The Variant Protease 

The protease may be expressed in an active or inactive, zymogen form. The protease
may be expressed in a heterologous expression system such as E. coli, Pichia pastoris,
S. cerevisiae, or a baculovirus expression system. The protein can either be expressed in an
intracellular environment or secreted into the media. The protease can also be expressed in
an in vitro expression system. To purify the variant protease, column chromatography can be
used. The protease may contain a C-terminal 6-His tag for purification on a Nickel column.
Depending on the pI of the protease, a cation or anion exchange column may be appropriate.
The protease can be stored in a low pH buffer that minimizes its catalytic activity so that it
will not degrade itself. Purification can also be accomplished through immunoabsorption, gel
filtration, or any other purification method commonly used in the art.
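
The dependence of column choice on pI follows the usual rule of thumb: below its pI a protein carries a net positive charge and binds a cation exchanger, and above its pI it binds an anion exchanger. A minimal sketch, assuming a hypothetical safety margin of 0.5 pH units that is not taken from the text:

```python
def choose_ion_exchanger(protein_pi, buffer_ph, margin=0.5):
    """Suggest a resin type from the protein pI and the working buffer pH.

    margin is a hypothetical minimum pH distance from the pI; closer than
    that, the net charge is treated as too small for reliable binding.
    """
    if protein_pi - buffer_ph >= margin:
        return "cation exchange"   # protein is net positive
    if buffer_ph - protein_pi >= margin:
        return "anion exchange"    # protein is net negative
    return "adjust buffer pH"


print(choose_ion_exchanger(9.5, 7.0))  # cation exchange
print(choose_ion_exchanger(5.0, 7.5))  # anion exchange
```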

4. Synthesis Of ACC Positional Scanning Libraries 

Those of skill in the art will recognize that many methods can be used to prepare the
peptides and the libraries of the invention. In an exemplary embodiment, the library is
screened by attaching a fluorogenically tagged substrate peptide to a solid support. The
fluorogenic leaving group of the substrate peptide is synthesized by condensing an N-Fmoc
coumarin derivative to acid-labile Rink linker to provide ACC resin (Backes et al., Nat
Biotechnol. 2000 Feb;18(2):187-93). Fmoc-removal produces a free amine. Natural,
unnatural and modified amino acids can be coupled to the amine, which can be elaborated by
the coupling of additional amino acids. After the synthesis of the peptide is complete, the
peptide-fluorogenic moiety conjugate can be cleaved from the solid support or, alternatively,
the conjugate can remain tethered to the solid support.

Thus, in a further preferred embodiment, the present invention provides a method of
preparing a fluorogenic peptide or a material including a fluorogenic peptide. The method
includes: (a) providing a first conjugate comprising a fluorogenic moiety covalently bonded
to a solid support; (b) contacting the first conjugate with a first protected amino acid moiety
and an activating agent, thereby forming a peptide bond between a carboxyl group and the
amine nitrogen of the first conjugate; (c) deprotecting, thereby forming a second conjugate
having a reactive amine moiety; (d) contacting the second conjugate with a second protected
amino acid and an activating agent, thereby forming a peptide bond between a carboxyl
group and the reactive amine moiety; and (e) deprotecting, thereby forming a third conjugate
having a reactive amine moiety.

In a preferred embodiment, the method further includes: (f) contacting the third
conjugate with a third protected amino acid and an activating agent, thereby forming a
peptide bond between a carboxyl group and the reactive amine moiety; and (g) deprotecting,
thereby forming a fourth conjugate having a reactive amine moiety.

For amino acids that are difficult to couple (Ile, Val, etc.), free, unreacted amine may
remain on the support and complicate subsequent synthesis and assay operations. A
specialized capping step employing the 3-nitrotriazole active ester of acetic acid in DMF
efficiently acylates the remaining aniline. The resulting acetic acid-capped coumarin that
may be present in unpurified substrate sequence solutions is generally not a protease substrate
sequence. P1-substituted resins that are provided by these methods can be used to prepare
any ACC-fluorogenic substrate.

In a further preferred embodiment, diversity at any particular position or combination of
positions is introduced by utilizing a mixture of at least two, preferably at least 6, more
preferably at least 12, and more preferably still, at least 20, amino acids to grow the peptide
chain. The mixtures of amino acids can include any useful amount of a particular amino
acid in combination with any useful amount of one or more different amino acids. In a
presently preferred embodiment, the mixture is an isokinetic mixture of amino acids (a
mixture in appropriate ratios to allow for equal molar reactivity of all components). An
isokinetic mixture is one in which the molar ratios of amino acids have been adjusted based
on their reported reaction rates (Ostresh, J. M., Winkle, J. H., Hamashin, V. T., & Houghten,
R. A. (1994). Biopolymers 34, 1681-1689).
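
The idea of an isokinetic mixture can be illustrated with a first-order simplification: if the initial coupling rate of each amino acid is proportional to its rate constant times its concentration, equal incorporation requires mole fractions inversely proportional to the rate constants. The relative rates below are hypothetical placeholders, not the published values, and the function name is illustrative.

```python
def isokinetic_fractions(relative_rates):
    """Mole fractions that equalize (rate constant x mole fraction).

    Simplifying assumption: coupling is first order in each amino acid and
    the reagents are in excess, so fraction_i is proportional to 1 / k_i.
    """
    weights = {aa: 1.0 / k for aa, k in relative_rates.items()}
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


# Hypothetical relative coupling rates (fast, reference, slow).
rates = {"Gly": 2.0, "Ala": 1.0, "Val": 0.5}
fractions = isokinetic_fractions(rates)

for aa, fraction in fractions.items():
    # The product rate x fraction is the same for every amino acid.
    print(aa, round(fraction, 4), round(rates[aa] * fraction, 4))
```

Slow-coupling residues such as Val therefore make up a larger share of the mixture than fast-coupling ones.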

Solid phase peptide synthesis in which the C-terminal amino acid of the sequence is
attached to an insoluble support followed by sequential addition of the remaining amino acids
in the sequence is the preferred method for preparing the peptide backbone of the compounds
of the present invention. Techniques for solid phase synthesis are described by Barany and
Merrifield, Solid-Phase Peptide Synthesis; pp. 3-284 in The Peptides: Analysis, Synthesis,
Biology. Vol. 2; Special Methods In Peptide Synthesis, Part A., Gross and Meienhofer, eds.
Academic Press, N.Y., 1980; and Stewart et al., Solid Phase Peptide Synthesis, 2nd ed. Pierce
Chem. Co., Rockford, Ill. (1984) which are incorporated herein by reference. Solid phase
synthesis is most easily accomplished with commercially available peptide synthesizers
utilizing Fmoc or t-BOC chemistry.

In a particularly preferred embodiment, peptide synthesis is performed using Fmoc
synthesis chemistry. The side chains of Asp, Ser, Thr and Tyr are preferably protected using
t-butyl and the side chain of Cys residue using S-trityl and S-t-butylthio, and Lys residues are
preferably protected using t-Boc, Fmoc and 4-methyltrityl. Appropriately protected amino
acid reagents are commercially available or can be prepared using art-recognized methods.
The use of multiple protecting groups allows selective deblocking and coupling of a
fluorophore to any particular desired side chain. Thus, for example, t-Boc deprotection is
accomplished using TFA in dichloromethane. Fmoc deprotection is accomplished using, for
example, 20% (v/v) piperidine in DMF or N-methylpyrrolidone, and 4-methyltrityl
deprotection is accomplished using, for example, 1 to 5% (v/v) TFA in water or 1% TFA and
5% triisopropylsilane in DCM. S-t-butylthio deprotection is accomplished using, for
example, aqueous mercaptoethanol (10%). Removal of t-butyl, t-Boc and S-trityl groups is
accomplished using, for example, TFA:phenol:water:thioanisole:ethanedithiol (85:5:5:2.5:2.5),
or TFA:phenol:water (95:5:5).


5. Screen The Protease For Specificity Changes. 

Essential amino acids in the proteases generated using the methods of the present
invention are identified according to procedures known in the art, such as site-directed
mutagenesis or saturation mutagenesis of active site residues. In the latter technique, residues
that form the S1-S4 pockets that have been shown to be important determinants of specificity
are mutated to every possible amino acid, either alone or in combination. See, for example,
Legendre et al., JMB (2000) 296:87-102. Substrate specificities of the resulting mutants
will be determined using the ACC positional scanning libraries and by single substrate kinetic
assays (Harris et al., PNAS, 2000, 97:7754-7759).

Multiple amino acid substitutions are made and tested using known methods of
mutagenesis and screening, such as those disclosed by Reidhaar-Olson and Sauer (Science
241:53-57, 1988) or Bowie and Sauer (Proc. Natl. Acad. Sci. USA 86:2152-2156, 1989).
Briefly, these authors disclose methods for simultaneously randomizing two or more
positions in a polypeptide, selecting for functional polypeptide, and then sequencing the
mutagenized polypeptides to determine the spectrum of allowable substitutions at each
position. Other methods that can be used include phage display (e.g., Legendre et al., JMB,
2000, 296:87-102; Lowman et al., Biochem. 30:10832-10837, 1991; Ladner et al., U.S. Pat.
No. 5,223,409; Huse, PCT Publication WO 92/06204) and region-directed mutagenesis
(Derbyshire et al., Gene 46:145, 1986; Ner et al., DNA 7:127, 1988).

Mutagenesis methods as disclosed above can be combined with high-throughput,
automated screening methods to detect activity of cloned, mutagenized polypeptides in host
cells. Mutagenized DNA molecules that encode proteolytically active proteins or precursors
thereof are recovered from the host cells and rapidly sequenced using modern equipment.
These methods allow the rapid determination of the importance of individual amino acid
residues in a polypeptide of interest, and can be applied to polypeptides of unknown
structure.

Screening by Protease Phage Display

In one embodiment, protease phage display is used to screen pools of mutant proteases
for various affinities to specific substrate sequences as described in Legendre et al., JMB,
2000, 296:87-102, and Corey et al., Gene, 1993 Jun 15;128(1):129-34. The phage technique
allows one to provide a physical link between a protein and the genetic information encoding
it. The protein of interest is constructed as a genetic fusion to a surface coat protein of a
bacterial virus. When the viral particle is produced in a bacterial host, the protein of interest
is produced as a fusion protein and displayed on the surface of the virus, and its gene is
packed within the capsid particle of the virus. Phage-displayed random protein libraries are
screened for binding to immobilized targets. Libraries of phage (with each phage
representing an individual mutant) are sorted for enhanced affinity against the target. Serine
proteases have been displayed on the surface of phage and this technique, coupled with a
suitable mutagenesis technique, is used to generate a diverse library of protease variants.

The target which is selected may be one related to a therapeutic application of the
protease. For example, the target sequence is present in an endotoxin, or a viral protein, or a
bacterial wall protein, or a native blood-borne peptide related to an auto-immune condition.
Here the protease selected is used in a treatment method, by administering the peptide, e.g.,
by intravenous administration, to a person in need of such treatment.

Screening Using Fluorescence

In another embodiment, the invention provides a method of assaying for the presence of an
enzymatically active protease. The method includes: (a) contacting a sample with a protease,
in such a manner whereby a fluorogenic moiety is released from a peptide substrate sequence
upon action of the protease, thereby producing a fluorescent moiety; and (b) observing
whether the sample undergoes a detectable change in fluorescence, the detectable change
being an indication of the presence of the enzymatically active protease in the sample.

This method of the invention can be used to assay for substantially any known or later
discovered protease. The sample containing the protease can be derived from substantially
any source, or organism. In one embodiment, the sample is a clinical sample from a subject.
In another embodiment, the protease is a member selected from the group consisting of
aspartic protease, cysteine protease, metalloprotease and serine protease. The method of the
invention is particularly preferred for the assay of proteases derived from a microorganism,
including, but not limited to, bacteria, fungi, yeast, viruses, and protozoa.

Assaying for protease activity in a solution simply requires adding a quantity of the
stock solution to a fluorogenic protease indicator and measuring the subsequent increase in
fluorescence or decrease in excitation band in the absorption spectrum. The solution and the
fluorogenic indicator may also be combined and assayed in a "digestion buffer" that
optimizes activity of the protease. Buffers suitable for assaying protease activity are well
known to those of skill in the art. In general, a buffer is selected with a pH which
corresponds to the pH optimum of the particular protease. For example, a buffer particularly
suitable for assaying elastase activity consists of 50 mM sodium phosphate, 1 mM EDTA at
pH 8.9. The measurement is most easily made in a fluorometer, an instrument that provides
an "excitation" light source for the fluorophore and then measures the light subsequently
emitted at a particular wavelength. Comparison with a control indicator solution lacking the
protease provides a measure of the protease activity. The activity level may be precisely
quantified by generating a standard curve for the protease/indicator combination in which the
rate of change in fluorescence produced by protease solutions of known activity is
determined.
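
The standard-curve quantification described above can be sketched as a straight-line fit of fluorescence rate against known activity, followed by inversion for an unknown sample. This is a minimal illustration that assumes a linear response over the calibrated range; the numbers and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def activity_from_rate(rate, slope, intercept):
    """Read an unknown activity off the fitted standard curve."""
    return (rate - intercept) / slope


# Hypothetical standards: known activity (units) and the measured rate of
# fluorescence change (RFU/s); the zero-activity point is the control
# indicator solution lacking protease.
known_activity = [0.0, 1.0, 2.0, 4.0]
measured_rate = [0.5, 2.5, 4.5, 8.5]

slope, intercept = fit_line(known_activity, measured_rate)
unknown = activity_from_rate(6.5, slope, intercept)
print(slope, intercept, unknown)
```

The intercept captures the background rate of the control indicator solution lacking protease, so it is subtracted before the rate is converted to activity.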

While detection of the fluorogenic compounds is preferably accomplished using a
fluorometer, detection may be accomplished by a variety of other methods well known to
those of skill in the art. Thus, for example, when the fluorophores emit in the visible
wavelengths, detection may be simply by visual inspection of fluorescence in response to
excitation by a light source. Detection may also be by means of an image analysis system
utilizing a video camera interfaced to a digitizer or other image acquisition system. Detection
may also be by visualization through a filter, as under a fluorescence microscope. The
microscope may provide a signal that is simply visualized by the operator. Alternatively, the
signal may be recorded on photographic film or using a video analysis system. The signal
may also simply be quantified in real time using either an image analysis system or a
photometer.

Thus, for example, a basic assay for protease activity of a sample involves suspending
or dissolving the sample in a buffer (at the pH optimum of the particular protease being
assayed), adding to the buffer a fluorogenic protease indicator, and monitoring the resulting
change in fluorescence using a spectrofluorometer as shown in Harris et al., J Biol Chem,
Vol. 273, Issue 42, 27364-27373, October 16, 1998. The spectrofluorometer is set to excite
the fluorophore at the excitation wavelength of the fluorophore and to detect the resulting
fluorescence at the emission wavelength of the fluorophore. The fluorogenic protease
indicator is a substrate sequence of a protease that changes in fluorescence due to a protease
cleaving the indicator.

In an illustrative embodiment, the invention provides a library useful for profiling of 
various serine and cysteine proteases. The library is able to distinguish proteases having 
specificity for different amino acids. 

In another illustrative embodiment, a library is provided for probing the extended
substrate sequence specificity of several serine proteases involved in blood coagulation, in
which the P1 position is held constant as either Lys or Arg, depending on the preferred
P1-specificity of the protease.

The PS-SCL strategy allows for the rapid and facile determination of proteolytic
substrate sequence specificity. Those of skill in the art will appreciate that these methods
provide a wide variety of alternative library formats. For example, fixing the P2-position as a
large hydrophobic amino acid may circumvent preferential internal cleavage by papain-fold
proteases and lead to proper register of the substrate sequence. Determination and
consideration of particular limitations relevant to any particular enzyme or method of
substrate sequence specificity determination are within the ability of those of skill in the art.

In addition to use in assaying for the presence of a selected enzyme, the method of the
invention is also useful for detecting, identifying and quantifying an enzyme in a sample
(e.g., protease). Thus, in another preferred embodiment, the screening method further
includes (c) quantifying the fluorescent moiety, thereby quantifying the enzyme (e.g.,
protease) present in the sample. The sample can be, e.g., a biological fluid, such as blood,
serum, urine, tears, milk or semen.

Screening Using Protease Sequence Specificity Assay 

In another preferred embodiment, these methods are used to select for an enzyme that
specifically cleaves a target sequence, and preferably for an enzymatically active protease.
The method includes: (a) providing a random peptide library containing an internally quenched
fluorophore, where the fluorophore is, e.g., o-aminobenzoyl and the quencher is, e.g., 3-
nitrotyrosine; (b) providing a peptide substrate sequence corresponding to the sequence
targeted for cleavage, which also contains an internally quenched fluorophore, where the
fluorophore is, e.g., Cy3B and the quencher is, e.g., Cy5Q; (c) mixing the random peptide
library and peptide substrate sequence at a 1:1 ratio; and (d) exposing the mixture to the
mutant protease and then quantitating the ratio of Cy3B fluorescence to o-aminobenzoyl
fluorescence. If a protease is selective for the target peptide, it will cleave only the target
peptide and not the random library, and thus there will be a high ratio of Cy3B fluorescence
to o-aminobenzoyl fluorescence (Meldal and Breddam, Anal. Biochem. (1991) 195: 141-147;
Gron, et al., Biochemistry (1992) 31: 6011-6018).
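
The ratio readout described above can be illustrated with a minimal sketch. The function name, the blank-subtraction step, and all numerical readings below are hypothetical and are not taken from the specification; because the two fluorophores differ in brightness, the ratio is in arbitrary units and is meaningful only for comparing variants measured under the same conditions.

```python
import math

def selectivity_ratio(target_fluor, library_fluor, target_blank=0.0, library_blank=0.0):
    """Background-corrected target-substrate signal divided by library signal.

    target_fluor  : fluorescence from the cleaved target substrate (e.g. Cy3B channel)
    library_fluor : fluorescence from the cleaved random library (e.g. o-aminobenzoyl channel)
    The blanks are no-enzyme readings; subtracting them is an assumption of this sketch.
    """
    target = max(target_fluor - target_blank, 0.0)
    library = max(library_fluor - library_blank, 0.0)
    if library == 0.0:
        # No detectable library cleavage: unbounded ratio, or undefined if nothing was cleaved.
        return math.inf if target > 0.0 else math.nan
    return target / library

# Hypothetical readings in arbitrary fluorescence units: (target channel, library channel)
readings = {
    "variant_A": (1500.0, 150.0),
    "variant_B": (1200.0, 900.0),
}

# Rank variants from most to least selective for the target sequence.
ranked = sorted(readings, key=lambda name: selectivity_ratio(*readings[name]), reverse=True)
```

In this sketch a variant that cleaves the target substrate but leaves the library largely intact sorts to the front of `ranked`.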

In another preferred embodiment, these methods are used to determine the sequence
specificity of an enzyme, and preferably of an enzymatically active protease. The method
includes: (a) contacting the protease with a library of peptides of the invention in such a
manner whereby the fluorogenic moiety is released from the peptide sequence, thereby
forming a fluorescent moiety; (b) detecting the fluorescent moiety; and (c) determining the
sequence of the peptide, thereby determining the peptide sequence specificity
profile of the protease.

In a preferred embodiment of the above-described method, the method further
includes (d) quantifying the fluorescent moiety, thereby quantifying the protease.

Moreover, in each of the aspects and embodiments set forth hereinabove, the protease
can be substantially any protease of interest, but is preferably an aspartic protease, cysteine
protease, metalloprotease or serine protease. The protease assayed using a method of the
invention can be derived from substantially any organism, including, but not limited to,
mammals (e.g., humans), birds, reptiles, insects, plants, fungi and the like. In a preferred
embodiment, the protease is derived from a microorganism, including, but not limited to,
bacteria, fungi, yeast, viruses, and protozoa.

6. Iteration Of Steps 1-5

The method is repeated iteratively to create a variant protease that has the desired
specificity and selectivity at each of the extended binding subsites, P2, P3, and P4. In some
cases, mutations in serine proteases have shown that the subsites that form the active
site (S1-S4) function independently of one another, such that modification of specificity at
one subsite has little influence on specificity at adjacent subsites. Thus, engineering substrate
specificity and selectivity throughout the extended binding site can be accomplished in a
step-wise manner.

Mutant proteases that match the desired specificity profiles, as determined by
substrate libraries, are then assayed using individual peptide substrates corresponding to the
desired cleavage sequence. Variant proteases are also assayed to ascertain that they will
cleave the desired sequence when presented in the context of the full-length protein. The
activity of the target protein is also assayed to verify that its function has been destroyed by
the cleavage event. The cleavage event is monitored by SDS-PAGE after incubating the
purified full-length protein with the variant protease.

In another embodiment, mutant proteases are combined to acquire the specificity of
multiple proteases. A mutation at one residue of a scaffold, which produces specificity at one
site, is combined in the same protease with another mutation at another site on the scaffold to
make a combined specificity protease. Any number of mutations at discrete sites on the same
scaffold can be used to create a combined specificity protease. In one specific embodiment, a
mutation in the granzyme B scaffold at position 99 from isoleucine to alanine was combined
with a mutation at position 218 of asparagine to alanine to create the combined specificity
protease I99A/N218A granzyme B, the properties of which are detailed herein.

Proteins targeted for cleavage and inactivation are identified by the following criteria:
1) the protein is involved in pathology; 2) there is strong evidence the protein is the critical
point of intervention for treating the pathology; 3) proteolytic cleavage of the protein will
likely destroy its function. Cleavage sites within target proteins are identified by the
following criteria: 1) they are located on the exposed surface of the protein; 2) they are
located in regions that are devoid of secondary structure (i.e., not in β sheets or α helices), as
determined by atomic structure or structure prediction algorithms (these regions tend to be
loops on the surface of proteins or stalks on cell surface receptors); 3) they are located at sites
that are likely to inactivate the protein, based on its known function. Cleavage sequences are,
e.g., four residues in length to match the extended substrate specificity of many serine
proteases, but can be longer or shorter.

In another embodiment of the invention, target protein-assisted catalysis is used to
generate proteases specific for a target protein. In target protein-assisted catalysis, the
invariant histidine that is part of the catalytic triad in a serine protease is mutated to alanine,
rendering the protease inactive. A histidine in the proper position in the target protein could
function as a hydrogen acceptor, in effect playing the same role as the mutated histidine in
the protease, thereby restoring catalytic activity. However, this places a stringent
requirement for having a histidine in the proper position in the substrate sequence (P2 or P1').
A single mutation in the substrate sequence binding site of the protease can alter its
substrate sequence specificity. Substrate sequence specificity can thus be altered using a
small number of mutations.

Using the methods disclosed above, one of ordinary skill in the art can identify and/or
prepare a variety of polypeptides that are substantially homologous to a protease scaffold or
allelic variants thereof and retain the proteolytic properties of the wild-type protein. In one
embodiment, these scaffolds comprise the amino acid sequences of trypsin, chymotrypsin,
subtilisin, thrombin, plasmin, Factor Xa, uPA, tPA, granzyme B, granzyme A, chymase,
MTSP-1, cathepsin G, elastase, papain, or cruzain. Such polypeptides may include a
targeting moiety comprising additional amino acid residues that form an independently
folding binding domain. Such domains include, for example, an extracellular ligand-binding
domain (e.g., one or more fibronectin type III domains) of a cytokine receptor;
immunoglobulin domains; DNA binding domains (see, e.g., He et al., Nature 378:92-96,
1995); affinity tags; and the like. Such polypeptides may also include additional polypeptide
segments as generally disclosed above.

Protease Polypeptides 

A polypeptide according to the invention includes a polypeptide including the amino
acid sequence of a protease whose sequence is provided in any one of the scaffolds described
herein. The invention also includes a mutant or variant protease any of whose residues may
be changed from the corresponding residues shown in any one of the scaffolds described
herein, while still encoding a protein that maintains its protease activities and physiological
functions, or a functional fragment thereof. In a preferred embodiment, the mutations occur
in the S1-S4 regions of the protease as detailed herein.

In general, a protease variant that preserves protease-like function includes any
variant in which residues at a particular position in the sequence have been substituted by
other amino acids, and further includes the possibility of inserting an additional residue or
residues between two residues of the parent protein as well as the possibility of deleting one
or more residues from the parent sequence. Any amino acid substitution, insertion, or
deletion is encompassed by the invention. In favorable circumstances, the substitution is a
conservative substitution as defined above.

One aspect of the invention pertains to isolated proteases, and biologically-active
portions thereof, or derivatives, fragments, analogs or homologs thereof. Also provided are
polypeptide fragments suitable for use as immunogens to raise anti-protease antibodies. In
another embodiment, proteases are produced by recombinant DNA techniques. As an
alternative to recombinant expression, a protease protein or polypeptide can be synthesized
chemically using standard peptide synthesis techniques.

Biologically-active portions of protease proteins include peptides comprising amino
acid sequences sufficiently homologous to or derived from the amino acid sequences of the
protease proteins that include fewer amino acids than the full-length protease proteins, and
exhibit at least one activity of a protease protein. Typically, biologically-active portions
comprise a domain or motif with at least one activity of the protease protein. A
biologically-active portion of a protease protein is a polypeptide which is, for example, 10,
25, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230,
240, 250, 260, 270, 280, 290, 300 or more amino acid residues in length.

Moreover, other biologically-active portions of a protein, in which other regions of
the protein are deleted, can be prepared by recombinant techniques and evaluated for one or
more of the functional activities of a native protease.

In an embodiment, the protease has an amino acid sequence of one of the scaffolds
described herein or one of the mutants of the scaffolds. The protease protein is substantially
homologous to one of the scaffolds described herein or one of the mutants of the scaffolds,
and retains the functional activity of the protein, yet differs in amino acid sequence due to
natural allelic variation or mutagenesis. Accordingly, in another embodiment, the protease
comprises an amino acid sequence at least about 45% homologous to the amino acid
sequence of one of the scaffolds described herein or one of the mutants of the scaffolds, and
retains the functional activity of one of the scaffolds described herein or one of the mutants of
the scaffolds. In a preferred embodiment, the protease comprises an amino acid sequence at
least about 90% homologous to the amino acid sequence of one of the scaffolds. In another
preferred embodiment, the protease comprises an amino acid sequence at least about 95%
homologous to the amino acid sequence of one of the scaffolds. In another preferred
embodiment, the protease comprises an amino acid sequence at least about 99% homologous
to the amino acid sequence of one of the scaffolds.

Determining Homology Between Two or More Sequences 

To determine the percent homology of two amino acid sequences or of two nucleic
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be
introduced in the sequence of a first amino acid or nucleic acid sequence for optimal
alignment with a second amino acid or nucleic acid sequence). The amino acid residues or
nucleotides at corresponding amino acid positions or nucleotide positions are then compared.
When a position in the first sequence is occupied by the same amino acid residue or
nucleotide as the corresponding position in the second sequence, then the molecules are
homologous at that position (i.e., as used herein amino acid or nucleic acid "homology" is
equivalent to amino acid or nucleic acid "identity").

The nucleic acid sequence homology may be determined as the degree of identity
between two sequences. The homology may be determined using computer programs known
in the art, such as GAP software provided in the GCG program package. See, Needleman
and Wunsch, 1970. J Mol Biol 48: 443-453. Using GCG GAP software with the following
settings for nucleic acid sequence comparison: GAP creation penalty of 5.0 and GAP
extension penalty of 0.3, the coding region of the analogous nucleic acid sequences referred
to above exhibits a degree of identity preferably of at least 70%, 75%, 80%, 85%, 90%, 95%,
98%, or 99%.

The term "sequence identity" refers to the degree to which two polynucleotide or
polypeptide sequences are identical on a residue-by-residue basis over a particular region of
comparison. The term "percentage of sequence identity" is calculated by comparing two
optimally aligned sequences over that region of comparison, determining the number of
positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of
nucleic acids) occurs in both sequences to yield the number of matched positions, dividing
the number of matched positions by the total number of positions in the region of comparison
(i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence
identity. The term "substantial identity" as used herein denotes a characteristic of a
polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least
80 percent sequence identity, preferably at least 85 percent identity and often 90 to 95 percent
sequence identity, more usually at least 99 percent sequence identity as compared to a
reference sequence over a comparison region.
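
By way of illustration only, the percentage calculation just described can be sketched as follows. The sketch assumes the two sequences have already been optimally aligned (for example, by a Needleman-Wunsch type program) and are supplied as equal-length strings with "-" marking gaps; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over an already-aligned comparison region.

    Matched positions are those where both sequences carry the same residue or base
    (a gap never counts as a match). The divisor is the total number of positions in
    the region of comparison, i.e. the window size, as in the definition above.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison region is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)

# Hypothetical aligned nucleotide sequences; one gap and one mismatch in a 9-position window.
example = percent_identity("ACGT-ACGT", "ACGTTACGA")
```

Here seven of the nine window positions match, so `example` is roughly 77.8 percent.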

Chimeric and Fusion Proteins 

The invention also provides protease chimeric or fusion proteins. As used herein, a
protease "chimeric protein" or "fusion protein" comprises a protease polypeptide
operatively-linked to a non-protease polypeptide. A "protease polypeptide" refers to a
polypeptide having an amino acid sequence corresponding to one of the scaffolds described
herein or one of the mutants of the scaffolds, whereas a "non-protease polypeptide" refers to a
polypeptide having an amino acid sequence corresponding to a protein that is not
substantially homologous to one of the scaffolds, e.g., a protein that is different from the
scaffolds and that is derived from the same or a different organism. Within a protease fusion
protein the protease polypeptide can correspond to all or a portion of a protease protein. In
one embodiment, a protease fusion protein comprises at least one biologically-active portion
of a protease protein. In another embodiment, a protease fusion protein comprises at least
two biologically-active portions of a protease protein. In yet another embodiment, a protease
fusion protein comprises at least three biologically-active portions of a protease protein.
Within the fusion protein, the term "operatively-linked" is intended to indicate that the
protease polypeptide and the non-protease polypeptide are fused in-frame with one another.
The non-protease polypeptide can be fused to the N-terminus or C-terminus of the protease
polypeptide.

In one embodiment, the fusion protein is a GST-protease fusion protein in which the
protease sequences are fused to the N-terminus of the GST (glutathione S-transferase)
sequences. Such fusion proteins can facilitate the purification of recombinant protease
polypeptides.

In another embodiment, the fusion protein is an Fc fusion in which the protease
sequences are fused to the N-terminus of the Fc domain from immunoglobulin G. Such
fusion proteins can have increased pharmacodynamic properties in vivo.

In another embodiment, the fusion protein is a protease protein containing a
heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host
cells), expression and/or secretion of protease can be increased through use of a heterologous
signal sequence.

A protease chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by conventional
techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene
fragments can be carried out using anchor primers that give rise to complementary overhangs
between two consecutive gene fragments that can subsequently be annealed and reamplified
to generate a chimeric gene sequence (see, e.g., Ausubel et al. (eds.) CURRENT PROTOCOLS IN
MOLECULAR BIOLOGY, John Wiley & Sons, 1992). Moreover, many expression vectors are
commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A
protease-encoding nucleic acid can be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the protease protein.

Protease Agonists and Antagonists

The invention also pertains to variants of the protease proteins that function as either
protease agonists (i.e., mimetics) or as protease antagonists. Variants of the protease protein
can be generated by mutagenesis (e.g., discrete point mutation or truncation of the protease
protein). An agonist of the protease protein can retain substantially the same, or a subset of,
the biological activities of the naturally occurring form of the protease protein. An antagonist
of the protease protein can inhibit one or more of the activities of the naturally occurring form
of the protease protein by, for example, cleaving the same target protein as the protease
protein. Thus, specific biological effects can be elicited by treatment with a variant of limited
function. In one embodiment, treatment of a subject with a variant having a subset of the
biological activities of the naturally occurring form of the protein has fewer side effects in a
subject relative to treatment with the naturally occurring form of the protease proteins.

Apoptosis

Methods of Inhibiting Apoptosis

Also included in the invention are methods of inhibiting apoptosis. Apoptosis, also
known as programmed cell death, plays a role in development, aging and in various
pathologic conditions. In developing organisms, both vertebrate and invertebrate, cells die in
particular positions at particular times as part of the normal morphogenetic process. The
process of apoptosis is characterized by, but not limited to, several events. Cells lose their
cell junctions and microvilli, the cytoplasm condenses and nuclear chromatin marginates into
a number of discrete masses. As the nucleus fragments, the cytoplasm contracts and
mitochondria and ribosomes become densely compacted. After dilation of the endoplasmic
reticulum and its fusion with the plasma membrane, the cell breaks up into several
membrane-bound vesicles, apoptotic bodies, which are usually phagocytosed by adjacent
cells. As fragmentation of chromatin into oligonucleotide fragments is characteristic of
the final stages of apoptosis, DNA cleavage patterns can be used as an in vitro assay for its
occurrence (Cory, Nature 367: 317-18, 1994).

In one aspect, the invention provides a method of treating or preventing an
apoptosis-associated disorder in a subject in need thereof by administering to the subject a
therapeutically effective amount of a protease-inhibitor so that apoptosis is inhibited. The
subject can be, e.g., any mammal, such as a primate (e.g., a human), mouse, rat, dog, cat, cow,
horse, or pig. The term "therapeutically effective" means that the amount of
protease-inhibitor, for example, which is used, is of sufficient quantity to ameliorate the
apoptosis-associated disorder.

An apoptosis-associated disorder includes, for example, immunodeficiency diseases,
including AIDS/HIV, senescence, neurodegenerative diseases, any degenerative disorder,
ischemic and reperfusion cell death, acute ischemic injury, infertility, wound-healing, and the
like.

Many methods for measuring apoptosis, including those described herein, are known
to the skilled artisan including, but not limited to, the classic methods of DNA ladder
formation by gel electrophoresis and of morphologic examination by electron microscopy.
The more recent and readily used method for measuring apoptosis is flow cytometry. Flow
cytometry permits rapid and quantitative measurements on apoptotic cells. Many different
flow cytometric methods for the assessment of apoptosis in cells have been described
(Darzynkiewicz et al., Cytometry 13: 795-808, 1992). Most of these methods measure
apoptotic changes in cells by staining with various DNA dyes (e.g., propidium iodide (PI),
DAPI, Hoechst 33342); however, techniques using the terminal deoxynucleotidyl transferase
(TUNEL) or nick translation assays have also been developed (Gorczyca et al., Cancer Res
53: 1945-1951, 1993). Recently, rapid flow cytometric staining methods that use Annexin V
for detection of phosphatidylserine exposure on the cell surface as a marker of apoptosis have
become commercially available. The newest flow cytometric assays measure caspase-3
activity, an early marker of cells undergoing apoptosis, and kits for performing these assays are
commercially available (Nicholson et al., Nature 376: 37-43, 1995).

A protease can be administered to cleave a caspase, thereby inhibiting its activity. In a
preferred embodiment, the protease can be administered to cleave caspase-3. The protease
can also cleave other proteins involved in apoptosis, e.g., human cytochrome c, human Apaf-1,
human caspase-9, human caspase-7, human caspase-6, human caspase-2, human BAD,
human BID, human BAX, human PARP, or human p53. By cleaving these proteins, the
protease thereby inactivates them. In this manner the protease can be used to inhibit
apoptosis.

In another aspect, apoptosis is inhibited in a cell by contacting a cell with a protease in
an amount sufficient to inhibit apoptosis. The cell population that is exposed to, i.e.,
contacted with, the protease can be any number of cells, i.e., one or more cells, and can be
provided in vitro, in vivo, or ex vivo. The cells are contacted with the protease protein, or
transfected with a polynucleotide that encodes the protease.

Methods of Inducing Apoptosis 

Also included in the invention are methods of inducing apoptosis. In one aspect,
apoptosis is induced in a subject in need thereof by administering a protease in an amount
sufficient to induce apoptosis. The subject can be, e.g., any mammal, a primate (e.g., a
human), mouse, rat, dog, cat, cow, horse, or pig. In various aspects the subject is susceptible
to cancer or an autoimmune disorder.

A protease can be administered with an anti-angiogenic compound. Examples of an
anti-angiogenic compound include, but are not limited to, a tyrosine kinase inhibitor, an
epidermal-derived growth factor inhibitor, a fibroblast-derived growth factor inhibitor, a
platelet-derived growth factor inhibitor, a matrix metalloprotease (MMP) inhibitor, an
integrin blocker, interferon alpha, interferon-inducible protein 10, interleukin-12, pentosan
polysulfate, a cyclooxygenase inhibitor, a nonsteroidal anti-inflammatory drug (NSAID), a
cyclooxygenase-2 inhibitor, carboxyamidotriazole, tetrahydrocortisol, combretastatin A-4,
squalamine, 6-O-(chloroacetyl-carbonyl)-fumagillol, thalidomide, angiostatin, endostatin,
troponin-1, an antibody to VEGF, platelet factor 4 or thrombospondin.

In some embodiments, the protease is further administered with a chemotherapeutic
compound. Examples of chemotherapeutic compounds include, but are not limited to,
paclitaxel, Taxol, lovastatin, mimosine, tamoxifen, gemcitabine, 5-fluorouracil (5-FU),
methotrexate (MTX), docetaxel, vincristine, vinblastine, nocodazole, teniposide, etoposide,
adriamycin, epothilone, navelbine, camptothecin, daunorubicin, dactinomycin, mitoxantrone,
amsacrine, epirubicin or idarubicin.

In another aspect, apoptosis is induced in a cell by contacting a cell with a protease in
an amount sufficient to induce apoptosis. The cell population that is exposed to, i.e.,
contacted with, the protease can be any number of cells, i.e., one or more cells, and can be
provided in vitro, in vivo, or ex vivo. The cells can be contacted with the protease protein, or
transfected with a polynucleotide that encodes the protease.

Some disease conditions are related to the development of a defective
down-regulation of apoptosis in the affected cells. For example, neoplasias result, at least in part,
from an apoptosis-resistant state in which cell proliferation signals inappropriately exceed
cell death signals. Furthermore, some DNA viruses such as Epstein-Barr virus, African
swine fever virus and adenovirus, parasitize the host cellular machinery to drive their own
replication. At the same time, they modulate apoptosis to repress cell death and allow the
target cell to reproduce the virus. Moreover, certain disease conditions such as
lymphoproliferative conditions, cancer including drug resistant cancer, arthritis,
inflammation, autoimmune diseases and the like may result from a down-regulation of cell
death. In such disease conditions, it is desirable to promote apoptotic mechanisms.

EXAMPLES

Example 1. Preparation and storage of I99A Granzyme B

The wild type rat granzyme B construct was prepared as described previously (Harris
et al., JBC, 1998, (273):27364-27373). The following point mutations were introduced into
the pPICZαA plasmid: N218A, N218T, N218V, I99A, I99F, I99R, Y174A, Y174V. Each
mutation was confirmed by sequencing with primers to the 5'AOX and 3'AOX regions,
followed by transformation into X33 cells and selection with Zeocin (Invitrogen, La Jolla
CA). Expression and purification for each variant was identical to the previously described
method for wild type rat granzyme B (Harris, et al., JBC, 1998, (273):27364-27373).

The protease rat granzyme B was mutated at Ile 99 to an alanine using the
QuikChange (Stratagene) method of site directed mutagenesis. DNA primers to introduce
the I99A mutation were: Forward primer: CCA GCG TAT AAT TCT AAG ACA GCC TCC
AAT GAC ATC ATG CTG (SEQ ID NO:3) Reverse primer: CAG CAT GAT GTC ATT
GGA GGC TGT CTT AGA ATT ATA CGC TGG (SEQ ID NO:5). A polymerase chain
reaction was made containing the wild type double stranded DNA, the two primers
overlapping the mutation, a reaction buffer, dNTPs and the DNA polymerase. After 30
rounds of annealing and amplification, the reaction was stopped. The enzyme DpnI was
added to digest the wild type DNA containing a modified base pair, and the resulting nicked
DNA strand was transformed into bacteria. Selection with Zeocin ensures that only positive
clones will grow. The mutation was confirmed by sequencing the granzyme B gene. The
same protocol was used to make the remaining granzyme B mutants, with appropriate
changes in the mutagenic primers.
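
As a simple consistency check on the primer pair given above, the following sketch confirms that the reverse primer is the reverse complement of the forward primer, as expected for a QuikChange-style primer pair. The helper function and variable names are illustrative only; the check that the central triplet is GCC (an alanine codon) assumes the triplets as printed are in the reading frame.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Primer sequences as given in the text, with the spacing between triplets removed.
forward = "CCA GCG TAT AAT TCT AAG ACA GCC TCC AAT GAC ATC ATG CTG".replace(" ", "")
reverse = "CAG CAT GAT GTC ATT GGA GGC TGT CTT AGA ATT ATA CGC TGG".replace(" ", "")

# The two mutagenic primers should anneal to opposite strands of the same site.
primers_are_complementary = reverse_complement(forward) == reverse

# Eighth triplet of the forward primer (0-based positions 21-23) carries the mutant codon.
mutant_codon = forward[21:24]
```

Both primers are 42 nucleotides long, and the comparison holds for the sequences as printed.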

The DNA encoding the variant granzyme B proteases was transformed into Pichia
pastoris X33 cells by the published protocol (Invitrogen) and the positive transformants were
selected with Zeocin. The colony was transferred to a 1 L liquid culture and grown to a cell
density of greater than OD600 = 1.0. Protein expression was induced by the addition of 0.5%
methanol and held constant over 72 hours. To purify the variant protease, the culture was
centrifuged and the supernatant collected. The supernatant was loaded by gravity flow onto
a SP-Sepharose Fast Flow cation exchange column. The column was washed with 50 mM
MES, pH 6.0, 100 mM NaCl, and more stringently with 50 mM MES, pH 6.0, 250 mM NaCl.
The protein was eluted with 50 mM MES, pH 6.0, 1 M NaCl and the column washed with 50
mM MES, pH 6.0, 2 M NaCl and 0.5 M NaOH. The resulting protease was <90% pure. The
final protease was exchanged and concentrated into 50 mM MES, pH 6.0, 100 mM NaCl for
storage at 4 °C.

Alternatively, following purification, each variant was quantitated by absorbance at
280 nm (ε280 = 13000 M-1 cm-1), titrated with wildtype ecotin or M84D ecotin as previously
described, exchanged into a buffer of 50 mM MES, pH 6.0 and 100 mM NaCl and stored at 4
°C.
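
The absorbance-based quantitation can be illustrated with the Beer-Lambert relation, c = A / (ε · l). The sketch below uses the extinction coefficient stated above; the 1 cm path length and the example absorbance reading are assumptions made for illustration. Absorbance reports total protein, whereas the ecotin titration mentioned above addresses the concentration of active enzyme.

```python
EPSILON_280 = 13000.0  # M^-1 cm^-1, extinction coefficient given in the text

def molar_concentration(a280: float, path_cm: float = 1.0, epsilon: float = EPSILON_280) -> float:
    """Beer-Lambert estimate of protein concentration (mol/L) from absorbance at 280 nm.

    path_cm defaults to a 1 cm cuvette, which is an assumption of this sketch.
    """
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    return a280 / (epsilon * path_cm)

# Hypothetical reading: A280 of 0.26 in a 1 cm cuvette.
conc_molar = molar_concentration(0.26)
conc_micromolar = conc_molar * 1e6
```

For the hypothetical reading shown, the estimate is about 20 micromolar.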

Example 2. Synthesis and Screening of ACC Positional Scanning Libraries 

ACC-Resin Synthesis

7-Fmoc-aminocoumarin-4-acetic acid was prepared by treating 7-aminocoumarin-4-
acetic acid with Fmoc-Cl. 7-Aminocoumarin-4-acetic acid (10.0 g, 45.6 mmol) and H2O (228
ml) were mixed. NaHCO3 (3.92 g, 45.6 mmol) was added in small portions followed by the
addition of acetone (228 ml). The solution was cooled with an ice bath, and Fmoc-Cl (10.7 g,
41.5 mmol) was added with stirring over the course of 1 h. The ice bath was removed and the
solution was stirred overnight. The acetone was removed with rotary evaporation and the
resulting gummy solid was collected by filtration and washed with several portions of hexane.

ACC-resin was prepared by condensation of Rink Amide AM resin with 7-Fmoc-
aminocoumarin-4-acetic acid. Rink Amide AM resin (21 g, 17 mmol) was solvated with
DMF (200 ml). The mixture was agitated for 30 min and filtered with a filter cannula,
whereupon 20% piperidine in DMF (200 ml) was added. After agitation for 25 min, the resin
was filtered and washed with DMF (3 times, 200 ml each). 7-Fmoc-aminocoumarin-4-acetic
acid (15 g, 34 mmol), HOBt (4.6 g, 34 mmol), and DMF (150 ml) were added, followed by
diisopropylcarbodiimide (DICI) (5.3 ml, 34 mmol). The mixture was agitated overnight,
filtered, washed (DMF, three times with 200 ml; tetrahydrofuran, three times with 200 ml;
MeOH, three times with 200 ml), and dried over P2O5. The substitution level of the resin was
0.58 mmol/g (>95%) as determined by Fmoc analysis.

P1-Diverse Library Synthesis

Individual P1-substituted Fmoc-amino acid ACC-resin (~25 mg, 0.013 mmol) was
added to wells of a MultiChem 96-well reaction apparatus. The resin-containing wells were
solvated with DMF (0.5 ml). After filtration, 20% piperidine in DMF solution (0.5 ml) was
added, followed by agitation for 30 min. The wells of the reaction block were filtered and
washed with DMF (three times with 0.5 ml). To introduce the randomized P2 position, an
isokinetic mixture of Fmoc-amino acids [4.8 mmol, 10 eq per well; Fmoc-amino acid, mol %:
Fmoc-Ala-OH, 3.4; Fmoc-Arg(Pbf)-OH, 6.5; Fmoc-Asn(Trt)-OH, 5.3; Fmoc-Asp(O-t-Bu)-
OH, 3.5; Fmoc-Glu(O-t-Bu)-OH, 3.6; Fmoc-Gln(Trt)-OH, 5.3; Fmoc-Gly-OH, 2.9; Fmoc-
His(Trt)-OH, 3.5; Fmoc-Ile-OH, 17.4; Fmoc-Leu-OH, 4.9; Fmoc-Lys(Boc)-OH, 6.2; Fmoc-
Nle-OH, 3.8; Fmoc-Phe-OH, 2.5; Fmoc-Pro-OH, 4.3; Fmoc-Ser(O-t-Bu)-OH, 2.8; Fmoc-
Thr(O-t-Bu)-OH, 4.8; Fmoc-Trp(Boc)-OH, 3.8; Fmoc-Tyr(O-t-Bu)-OH, 4.1; Fmoc-Val-OH,
11.3] was preactivated with DICI (390 µl, 2.5 mmol), and HOBt (340 mg, 2.5 mmol) in DMF
(10 ml). The solution (0.5 ml) was added to each of the wells. The reaction block was
agitated for 3 h, filtered, and washed with DMF (three times with 0.5 ml). The randomized
P3 and P4 positions were incorporated in the same manner. The Fmoc of the P4 amino acid
was removed and the resin was washed with DMF (three times with 0.5 ml) and treated with
0.5 ml of a capping solution of AcOH (150 µl, 2.5 mmol), HOBt (340 mg, 2.5 mmol), and
DICI (390 µl, 2.5 mmol) in DMF (10 ml). After 4 h of agitation, the resin was washed with
DMF (three times with 0.5 ml) and CH2Cl2 (three times with 0.5 ml), and treated with a
solution of 95:2.5:2.5 TFA/TIS/H2O. After incubation for 1 h the reaction block was opened
and placed on a 96-deep-well titer plate and the wells were washed with additional cleavage
solution (twice with 0.5 ml). The collection plate was concentrated, and the material in the
substrate-containing wells was diluted with EtOH (0.5 ml) and concentrated twice. The
contents of the individual wells were lyophilized from CH3CN/H2O mixtures. The total 

20 amount of substrate in each well was conservatively estimated to be 0.0063 mmol (50%) on 
the basis of yields of single substrates. 
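
The composition of the isokinetic mixture can be checked with a short calculation. The following is a minimal sketch in Python, assuming only the mol % values and the 4.8 mmol total stated above; the dictionary and function names are illustrative and not part of the described method.

```python
# Mol % of each Fmoc-amino acid in the isokinetic mixture, as listed in the text.
ISOKINETIC_MOL_PERCENT = {
    "Ala": 3.4, "Arg": 6.5, "Asn": 5.3, "Asp": 3.5, "Glu": 3.6,
    "Gln": 5.3, "Gly": 2.9, "His": 3.5, "Ile": 17.4, "Leu": 4.9,
    "Lys": 6.2, "Nle": 3.8, "Phe": 2.5, "Pro": 4.3, "Ser": 2.8,
    "Thr": 4.8, "Trp": 3.8, "Tyr": 4.1, "Val": 11.3,
}

TOTAL_MMOL = 4.8  # total Fmoc-amino acid in the mixture, from the text


def mmol_per_amino_acid(mol_percent, total_mmol):
    """Return the mmol of each amino acid for a given total amount."""
    return {aa: total_mmol * pct / 100.0 for aa, pct in mol_percent.items()}


amounts = mmol_per_amino_acid(ISOKINETIC_MOL_PERCENT, TOTAL_MMOL)

# List the components from the largest share to the smallest.
for aa, mmol in sorted(amounts.items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {mmol:.3f} mmol")
```

The listed percentages sum to 99.9, so the mixture is specified to within rounding. Slower-coupling residues such as isoleucine and valine carry the largest shares, which is consistent with the purpose of an isokinetic mixture.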
P1-Fixed Library Synthesis

Multigram quantities of P1-substituted ACC-resin could be synthesized by the methods
described. Fmoc-amino acid-substituted ACC resin was placed in 57 wells of a 96-well
reaction block: sublibraries were denoted by the second fixed position (P4, P3, P2) of 19
amino acids (cysteine was omitted and norleucine was substituted for methionine).
Synthesis, capping, and cleavage of the substrates were identical to those described in
the previous section, with the exception that for P2, P3, and P4 sublibraries, individual
amino acids (5 eq of Fmoc-amino acid monomer, 5 eq of DICI, and 5 eq of HOBt in DMF),
rather than isokinetic mixtures, were incorporated in the spatially addressed P2, P3, or
P4 positions.
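
The well count of the P1-fixed library follows from the layout described above. The sketch below, in Python, enumerates the wells under the assumption that each well is identified by its second fixed position and the residue placed there; the one-letter codes, the use of "n" for norleucine, and the function name are illustrative choices.

```python
from itertools import product

# 19 residues: the 20 standard amino acids with cysteine omitted and
# norleucine (written "n" here, an illustrative code) in place of methionine.
RESIDUES = list("ADEFGHIKLNPQRSTVWY") + ["n"]
FIXED_POSITIONS = ["P4", "P3", "P2"]


def p1_fixed_wells():
    """One well per (second fixed position, residue) pair."""
    return [(pos, aa) for pos, aa in product(FIXED_POSITIONS, RESIDUES)]


wells = p1_fixed_wells()

# The two remaining positions are randomized over the same 19 residues.
sequences_per_well = len(RESIDUES) ** 2

print(len(wells), sequences_per_well)  # 57 361
```

Three sublibraries of 19 wells give the 57 wells stated in the text, and each well holds 19 x 19 = 361 sequences.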

Preparation of the complete diverse and P1-fixed combinatorial libraries was carried
out as described above. The library was aliquoted into 96-well plates to a final
concentration of 250 µM. Variant proteases were diluted in granzyme activity buffer (50 mM
Na Hepes, pH 8.0, 100 mM NaCl, 0.01% Tween-20) to concentrations between 50 nM and 1 µM.
Initial activity against Ac-IEPD-AMC was used to adjust the variant protease concentration
to one approximately equal to 50 nM wild type rat granzyme B. Enzymatic activity in the
P1-Asp library was assayed for one hour at 30 °C on a Spectra-Max Delta fluorimeter
(Molecular Devices). Excitation and emission were measured at 380 nm and 460 nm,
respectively.

Example 3. Individual Kinetic Measurements of I99A Granzyme B 

Individual kinetic measurements were performed using a Spectra-Max Delta
fluorimeter. Each protease was diluted to between 50 nM and 1 µM in assay buffer. All
ACC substrates were diluted with Me2SO to between 5 and 500 µM, while AMC substrates
were diluted to between 20 and 2000 µM. Each assay contained less than 5% Me2SO.
Enzymatic activity was monitored every 15 seconds at excitation and emission wavelengths
of 380 nm and 460 nm, respectively, for a total of 10 minutes. All assays were performed in
1% DMSO.
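
The text does not state how rates were extracted from the fluorescence traces. As one possibility, the Python sketch below assumes the initial rate is taken as the least-squares slope of fluorescence against time over the reads collected every 15 seconds for 10 minutes. The function names and the example trace are hypothetical, and a linear fit is only appropriate while the trace remains linear.

```python
def read_times(interval_s=15, total_s=600):
    """Time points for a read every `interval_s` seconds, including t = 0."""
    return [i * interval_s for i in range(total_s // interval_s + 1)]


def initial_rate(times_s, fluorescence):
    """Least-squares slope of fluorescence versus time (units per second)."""
    n = len(times_s)
    if n != len(fluorescence) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fluorescence))
    return sxy / sxx


times = read_times()

# Hypothetical trace: 100 units of background rising by 2 units per second.
trace = [100.0 + 2.0 * t for t in times]

print(len(times), initial_rate(times, trace))
```

With a read at time zero, the 10 minute window contains 41 time points.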

This method was used to screen I99A granzyme B. I99A granzyme B was profiled in a
positional scanning combinatorial substrate library to determine the effect of the mutation.
The library was prepared as described above and aliquoted into 96-well plates to a final
concentration of 250 µM. The variant protease was diluted in granzyme activity buffer (50
mM Na Hepes, pH 8.0, 100 mM NaCl, 0.01% Tween-20) to concentrations between 50 nM
and 1 µM. Initial activity against Ac-IEPD-AMC was used to adjust the variant protease
concentration to one approximately equal to 50 nM wild type rat granzyme B. Enzymatic
activity in the P1-Asp library was assayed for one hour at 30 °C on the Spectra-Max Delta
fluorimeter. Excitation and emission were measured at 380 nm and 460 nm, respectively.
The profiles of the granzyme B variants were compared to the wild type profile and the
differences determined. For the I99A mutant, for example, the specificity at the P2 amino
acid was markedly changed from the wild type. The broad wild type preference, with a slight
bias toward proline, is replaced by a strong preference for hydrophobic residues such as
Phe and Tyr. The selectivity of the variant protease was also changed. The wild type was
promiscuous at the P2 subsite, hydrolyzing substrates that contain any amino acid at that
site. The I99A protease is much more selective. A Phe at the P2 site is preferred to a much
higher degree than any other amino acid (See Table 5, above).
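
The comparison of a variant profile with the wild type profile can be sketched as follows. This Python example assumes each profile is normalized to its most active residue before subtraction, which the text does not specify. The activity values are hypothetical placeholders chosen only to illustrate the calculation and are not measured data.

```python
def normalize(profile):
    """Scale a {residue: activity} profile so its largest value is 1.0."""
    top = max(profile.values())
    if top <= 0:
        raise ValueError("profile has no positive activity")
    return {aa: v / top for aa, v in profile.items()}


def profile_difference(wild_type, variant):
    """Variant minus wild type, per residue, after normalization."""
    wt, var = normalize(wild_type), normalize(variant)
    return {aa: var[aa] - wt[aa] for aa in wt}


# Hypothetical P2 activities in arbitrary fluorescence units (not measured data).
wt_p2 = {"Pro": 100.0, "Phe": 60.0, "Ala": 80.0}
i99a_p2 = {"Pro": 20.0, "Phe": 200.0, "Ala": 10.0}

diff = profile_difference(wt_p2, i99a_p2)

# Residue whose relative preference increased the most in the variant.
gained = max(diff, key=diff.get)
print(gained, round(diff[gained], 2))
```

A positive difference marks a residue the variant prefers more strongly than the wild type does, and a negative difference marks a residue it prefers less.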
Example 4. Proteolytic cleavage and inactivation of tumor necrosis factor and tumor
necrosis factor receptor.
Receptor cleavage. 

Freshly isolated neutrophils (PMN) were resuspended at 1 × 10⁷ cells/ml in RPMI 1640
with 0.2% fetal calf serum (FCS) and incubated with various concentrations of protease
specific for the stalk region of TNF-R1 or TNF-R2. After a 1 to 40 min incubation at 37 °C,
protease inhibitors were added to stop the reaction and the amount of TNF-R released into
the media was quantitated using ELISA (Roche).
TNF cleavage. 

¹²⁵I-TNF (40,000 cpm) was incubated with varying concentrations of protease and
then samples were boiled in SDS-PAGE sample buffer and examined on a 12%
polyacrylamide gel. Gels were dried and exposed to x-ray film (Kodak) at -70 °C.
TNF binding assay. 

¹²⁵I-TNF or PMN were incubated with varying concentrations of proteases as above.
The binding of ¹²⁵I-TNF exposed to proteases to normal PMN, or the binding of normal
¹²⁵I-TNF to PMN exposed to proteases, was quantitated using scintillation. Briefly, 10⁵ cells
were incubated with varying concentrations of ¹²⁵I-TNF in 96-well filter plates (Millipore)
in the presence of protease inhibitors. Cells were then washed three times by vacuum
aspiration and then 30 µl of scintillation fluid (Wallac) was added to each well.
Scintillation was then counted on a Wallac Microbeta scintillation counter. (Adapted from
van Kessel et al., J. Immunol. (1991) 147: 3862-3868 and Porteau et al., JBC (1991)
266:18846-18853).



Example 5. Selection of Enzymes Capable of Peptide Sequence Specific Target
Cleavage Using Protease Phage Display

The phagemid is constructed such that it (i) carries all the genes necessary for M13
phage morphogenesis; (ii) carries a packaging signal which interacts with the phage origin
of replication to initiate production of single-stranded DNA; (iii) carries a disrupted
phage origin of replication; and (iv) carries an ampicillin resistance gene.
The combination of an inefficient phage origin of replication and an intact plasmid
origin of replication favors propagation of the vector in the host bacterium as a plasmid
(as RF, replicating form, DNA) rather than as a phage. It can therefore be maintained
without killing the host. Furthermore, possession of a plasmid origin means that it can
replicate independently of the efficient phage-like propagation of the phagemid. By virtue
of the ampicillin resistance gene, the vector can be amplified, which in turn increases
packaging of phagemid DNA into phage particles.

Fusion of the protease gene to either the gene 3 or gene 8 M13 coat proteins can be
constructed using standard cloning methods (Sidhu, Methods in Enzymology, 2000, V328,
p333). A combinatorial library of variants within the gene encoding the protease is then
displayed on the surface of M13 as a fusion to the p3 or p8 M13 coat proteins and panned
against an immobilized, aldehyde-containing peptide corresponding to the target cleavage
sequence of interest. The aldehyde moiety inhibits the ability of the protease to cleave
the scissile bond of the peptide; however, this moiety does not interfere with protease
recognition of the peptide. Variant protease-displaying phage with specificity for the
immobilized target peptide will bind to target peptide coated plates, whereas non-specific
phage will be washed away. Through consecutive rounds of panning, proteases with enhanced
specificity towards the target sequence can be isolated. The target sequence can then be
synthesized without the aldehyde and isolated phage can be tested for specific hydrolysis
of the peptide.

Example 6. The synthesis and fluorescence screening of libraries. 
A. P1-diverse Library
A(i). Synthesis 

Individual P1-substituted Fmoc-amino acid ACC-resin (ca. 25 mg, 0.013 mmol) was
added to wells of a Multi-Chem 96-well reaction apparatus. The resin-containing wells were
solvated with DMF (0.5 mL). A 20% piperidine in DMF solution (0.5 mL) was added
followed by agitation for 30 min. The wells of the reaction block were filtered and washed
with DMF (3 × 0.5 mL). In order to introduce the randomized P2 position, an isokinetic
mixture (Ostresh, J. M., et al., (1994) Biopolymers 34:1681-9) of Fmoc-amino acids (4.8
mmol, 10 equiv/well; Fmoc-amino acid, mol %: Fmoc-Ala-OH, 3.4; Fmoc-Arg(Pbf)-OH,
6.5; Fmoc-Asn(Trt)-OH, 5.3; Fmoc-Asp(O-t-Bu)-OH, 3.5; Fmoc-Glu(O-t-Bu)-OH, 3.6;
Fmoc-Gln(Trt)-OH, 5.3; Fmoc-Gly-OH, 2.9; Fmoc-His(Trt)-OH, 3.5; Fmoc-Ile-OH, 17.4;
Fmoc-Leu-OH, 4.9; Fmoc-Lys(Boc)-OH, 6.2; Fmoc-Nle-OH, 3.8; Fmoc-Phe-OH, 2.5;
Fmoc-Pro-OH, 4.3; Fmoc-Ser(O-t-Bu)-OH, 2.8; Fmoc-Thr(O-t-Bu)-OH, 4.8; Fmoc-Trp(Boc)-OH,
3.8; Fmoc-Tyr(O-t-Bu)-OH, 4.1; Fmoc-Val-OH, 11.3) was pre-activated with DICI (390 µL,
2.5 mmol) and HOBt (340 mg, 2.5 mmol) in DMF (10 mL). The solution (0.5 mL) was
added to each of the wells. The reaction block was agitated for 3 h, filtered, and washed
with DMF (3 × 0.5 mL). The randomized P3 and P4 positions were incorporated in the same
manner. The Fmoc of the P4 amino acid was removed and the resin was washed with DMF
(3 × 0.5 mL), and treated with 0.5 mL of a capping solution of AcOH (150 µL, 2.5 mmol),
HOBt (340 mg, 2.5 mmol) and DICI (390 µL, 2.5 mmol) in DMF (10 mL). After 4 h of
agitation, the resin was washed with DMF (3 × 0.5 mL) and CH2Cl2 (3 × 0.5 mL), and treated
with a solution of 95:2.5:2.5 TFA/TIS/H2O. After incubating for 1 h the reaction block was
opened and placed on a 96 deep-well titer plate and the wells were washed with additional
cleavage solution (2 × 0.5 mL). The collection plate was concentrated, and the
substrate-containing wells were diluted with EtOH (0.5 mL) and concentrated twice. The
contents of the individual wells were lyophilized from CH3CN:H2O mixtures. The total amount
of substrate in each well was conservatively estimated to be 0.0063 mmol (50%) based upon
yields of single substrates.

A(ii). Enzymatic Assay of Library 

The concentration of proteolytic enzymes was determined by absorbance measured at
280 nm (Gill, S. C., et al., (1989) Anal Biochem 182:319-26). The proportion of catalytically
active thrombin, plasmin, trypsin, uPA, tPA, and chymotrypsin was quantitated by active-site
titration with MUGB or MUTMAC (Jameson, G. W., et al., (1973) Biochemical Journal
131:107-117).

Substrates from the PS-SCLs were dissolved in DMSO. Approximately 1.0 × 10⁻⁹ mol
of each P1-Lys, P1-Arg, or P1-Leu sub-library (361 compounds) was added to 57 wells of a
96-well microfluor plate (Dynex Technologies, Chantilly, Va.) for a final concentration of
0.1 µM. Approximately 1.0 × 10⁻¹⁰ mol of each P1-diverse sub-library (6859 compounds) was
added to 20 wells of a 96-well plate for a final concentration of 0.01 µM in each compound.
Hydrolysis reactions were initiated by the addition of enzyme (0.02 nM-100 nM) and
monitored fluorometrically with a Perkin Elmer LS50B Luminescence Spectrometer, with
excitation at 380 nm and emission at 450 nm or 460 nm. Assays of the serine proteases were
performed at 25 °C in a buffer containing 50 mM Tris, pH 8.0, 100 mM NaCl, 0.5 mM
CaCl2, 0.01% Tween-20, and 1% DMSO (from substrates). Assay of the cysteine proteases,
papain and cruzain, was performed at 25 °C in a buffer containing 100 mM sodium acetate,
pH 5.5, 100 mM NaCl, 5 mM DTT, 1 mM EDTA, 0.01% Brij-35, and 1% DMSO (from
substrates).

B. Profiling Proteases with a P1-diverse Library of 137,180 Substrate Sequences
To test the possibility of attaching all amino acids to the P1-site in the substrate
sequence, a P1-diverse tetrapeptide library was created. The P1-diverse library consists of
20 wells in which only the P1-position is systematically held constant as all amino acids,
excluding cysteine and including norleucine. The P2, P3, and P4 positions consist of an
equimolar mixture of all amino acids for a total of 6,859 substrate sequences per well.
Several serine and cysteine proteases were profiled to test the applicability of this
library for the identification of the optimal P1 amino acid. Chymotrypsin showed the
expected specificity for large hydrophobic amino acids. Trypsin and thrombin showed
preference for P1-basic amino acids (Arg>Lys). Plasmin also showed a preference for basic
amino acids (Lys>Arg). Granzyme B, the only known mammalian serine protease to have P1-Asp
specificity, showed a distinct preference for aspartic acid over all other amino acids,
including the other acidic amino acid, Glu. The P1-profile for human neutrophil elastase has
the canonical preference for alanine and valine. The cysteine proteases, papain and cruzain,
showed the broad P1-substrate sequence specificity that is known for these enzymes,
although there is a modest preference for arginine.
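
The library size quoted in this section can be reproduced with a short calculation. The Python sketch below assumes 20 residues at P1 (one per well) and the 19-member mixture at each of P2, P3, and P4, as described in the text; the variable names are illustrative.

```python
P1_RESIDUES = 20        # one well per P1 residue (no cysteine, norleucine included)
MIXTURE_RESIDUES = 19   # residues in the mixture at each of P2, P3, and P4
RANDOM_POSITIONS = 3    # P2, P3, and P4

per_well = MIXTURE_RESIDUES ** RANDOM_POSITIONS
total = P1_RESIDUES * per_well

print(per_well, total)  # 6859 137180
```

Each well therefore holds 19 x 19 x 19 = 6,859 sequences, and the 20 wells together cover 137,180 sequences.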

Example 7. Screening for cleavage of individual substrates 

Mutant proteases that match the desired specificity profiles, as determined by
substrate libraries, are assayed using individual peptide substrates corresponding to the
desired cleavage sequence. Individual kinetic measurements are performed using a
Spectra-Max Delta fluorimeter (Molecular Devices). Each protease is diluted to between
50 nM and 1 µM in assay buffer. All ACC substrates are diluted with Me2SO to between 5 and
500 µM, while AMC substrates are diluted to between 20 and 2000 µM. Each assay contains
less than 5% Me2SO. Enzymatic activity is monitored every 15 seconds at excitation and
emission wavelengths of 380 nm and 460 nm, respectively, for a total of 10 minutes. All
assays are performed in 1% DMSO.

Example 8. Screening for cleavage of full-length proteins 

Variant proteases are assayed to ascertain that they will cleave the desired sequence
when presented in the context of the full-length protein, and the activity of the target
protein is assayed to verify that its function has been destroyed by the cleavage event. The
cleavage event is monitored by SDS-PAGE after incubating the purified full-length protein
with the variant protease. The protein is visualized using standard Coomassie blue staining,
by autoradiography using radiolabeled protein, or by Western blot using the appropriate
antibody. Alternatively, if the target protein is a cell surface receptor, cells expressing
the target protein are exposed to the variant protease. The cleavage event is monitored by
lysing the cells and then separating the proteins by SDS-PAGE, followed by visualization by
Western blot. Alternatively, the soluble receptor released by proteolysis is quantified by
ELISA.

The cleavage of the tumor necrosis factor receptors 1 and 2 (TNF-R1 and TNF-R2)
was measured using these techniques. Freshly isolated neutrophils (PMN) were resuspended
at 1 × 10⁷ cells/ml in RPMI 1640 with 0.2% fetal calf serum (FCS) and incubated with
various concentrations of protease specific for the stalk region of TNF-R1 or TNF-R2. After
an incubation of 1 to 40 min at 37 °C, protease inhibitors were added to stop the reaction
and the amount of TNF-R released into the media was quantified using ELISA (Roche).

Although the invention has been described with respect to specific methods of making
and using enzymes capable of cleaving target polypeptide sequences, it will be apparent that
various changes and modifications may be made without departing from the invention.
Cleavage of TNF.

¹²⁵I-TNF (40,000 cpm) was incubated with varying concentrations of protease,
samples were boiled in SDS-PAGE sample buffer and examined on a 12% polyacrylamide
gel. The gels were dried and exposed to x-ray film (Kodak) at -70 °C.

TNF Binding Assay.

¹²⁵I-TNF or PMN were incubated with varying concentrations of proteases as above.
The binding of ¹²⁵I-TNF exposed to proteases to normal PMN, or the binding of normal
¹²⁵I-TNF to PMN exposed to proteases, was quantified using scintillation. Briefly, 10⁵ cells
were incubated with varying concentrations of ¹²⁵I-TNF in 96-well filter plates (Millipore)
in the presence of protease inhibitors. Cells were washed three times by vacuum aspiration
and 30 µL of scintillation fluid (Wallac) was added to each well. Scintillation was counted
on a Wallac Microbeta scintillation counter. (Adapted from van Kessel et al., J. Immunol.
(1991) 147: 3862-3868 and Porteau et al., JBC (1991) 266:18846-18853).



Example 9. Identification of target proteins and cleavage sites therein

Proteins targeted for cleavage and inactivation are identified by the following criteria:
1) the protein is involved in pathology; 2) there is strong evidence the protein is the
critical point of intervention for treating the pathology; 3) proteolytic cleavage of the
protein will likely destroy its function. Cleavage sites within target proteins are
identified by the following criteria: 1) they are located on the exposed surface of the
protein; 2) they are located in regions that are devoid of secondary structure (i.e.,
β sheets or α helices), as determined by atomic structure or structure prediction
algorithms; these regions tend to be loops on the surface of proteins or stalks on cell
surface receptors; 3) they are located at sites that are likely to inactivate the protein,
based on its known function. Cleavage sequences can be four residues in length to match the
extended substrate specificity of many serine proteases, but can be longer or shorter.
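
The first two cleavage site criteria lend themselves to a simple scan. The Python sketch below assumes per-residue annotations of surface exposure and secondary structure are already available from an atomic structure or a prediction algorithm; the function name, the annotation format, and the toy input are hypothetical. The third criterion, likelihood of inactivation, depends on the known function of the protein and is not modeled here.

```python
def candidate_sites(sequence, exposed, structure, length=4):
    """Return (start, peptide) for windows that are fully exposed and unstructured.

    sequence  -- one-letter amino acid string
    exposed   -- list of booleans, True where the residue is surface exposed
    structure -- per-residue labels: 'H' helix, 'E' sheet, '-' neither
    Start positions are 1-based.
    """
    if not (len(sequence) == len(exposed) == len(structure)):
        raise ValueError("annotations must match the sequence length")
    hits = []
    for i in range(len(sequence) - length + 1):
        window = range(i, i + length)
        if all(exposed[j] and structure[j] == "-" for j in window):
            hits.append((i + 1, sequence[i:i + length]))
    return hits


# Hypothetical toy input, not a real protein or real annotations.
seq = "MKTAEAKGLL"
exp = [False, False, True, True, True, True, True, True, False, False]
sse = "HHH-----EE"

print(candidate_sites(seq, exp, sse))
```

The default window of four residues mirrors the extended substrate specificity noted above; the `length` argument allows longer or shorter sequences.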

Tumor Necrosis Factor And Tumor Necrosis Factor Receptor 
Tumor necrosis factor (TNF) is a pro-inflammatory cytokine that is primarily
produced by monocytes, macrophages, and lymphocytes. TNF initiates signal transduction
by interacting with either of two surface bound receptors, the p55 tumor necrosis factor
receptor (TNF-R1) and the p75 tumor necrosis factor receptor (TNF-R2). TNF plays a
central part in the pathophysiology of rheumatoid arthritis (RA), and is found at high
concentrations in the synovium and synovial fluid of patients with RA. TNF signaling events
result in the production of other pro-inflammatory cytokines (IL-1, IL-6, GM-CSF), induce
the production of metalloproteinases such as collagenase and stromelysin, and increase the
proliferation and activity of osteoclasts; all of these events lead to synovitis and tissue
and bone destruction. Both types of TNF receptors are shed from the cell's surface as
soluble forms that retain their ligand binding ability. These soluble TNFRs can neutralize
TNF activity both in vitro and in vivo, and are believed to act as natural inhibitors to
attenuate TNF signaling.

VEGFR. 

Vascular endothelial growth factor (VEGF) is an endothelial cell-specific mitogen
normally produced during embryogenesis and adult life. VEGF is a significant mediator of
angiogenesis in a variety of normal and pathological processes, including tumor
development. Tumor vascularization is a vital process for the progression of a tumor to a
stage from which it can metastasize. Three high affinity cognate receptors to VEGF have
been identified: VEGFR-1/Flt-1, VEGFR-2/KDR, and VEGFR-3/Flt-4. VEGFRs are cell
surface receptor tyrosine kinases that function as signaling molecules during vascular
development.
EGFR AND HER-2. 
The ErbB family of receptor tyrosine kinases comprises four members: EGFR (Her-1),
ErbB2 (Her-2), ErbB3 (Her-3) and ErbB4 (Her-4). All are essential for normal development
and participate in the functioning of normal cells. ErbB receptors, particularly EGFR and
ErbB2, are commonly deregulated in certain prevalent forms of human cancer. Dysregulation
of ErbB signaling occurs by various mechanisms, including gene amplification, mutations
that increase receptor transcription, or mutations that increase receptor activity.
Activation of the ErbB receptors through the binding of the epidermal growth factor (EGF)
results in downstream signaling through the mitogen-activated protein kinase (MAPK) and the
Akt/phosphoinositide 3-kinase (PI3-kinase) pathways, ultimately leading to cell
proliferation, differentiation, and angiogenesis.

Potential Proteolytic Cleavage Sites* 

Proteolytic cleavage sites for the proteins described above are shown below in Table 7.

Table 7. Cleavage sequences for selected disease related proteins

Target     Cleavage sequence   Region   Indication
TNF-α      AEAK                loop     Rheumatoid arthritis, Crohn's disease,
                                        Inflammatory bowel disease, Psoriasis
TNF-R1     ENVK                stalk
           GTED                stalk
TNF-R2     SPTR                stalk
           VSTR                stalk
           STSF                stalk
HER-2      KFPD                stalk    Breast cancer
           AEQR                stalk
EGFR       KYAD                stalk    Lung, breast, bladder, prostate,
           NGPK                stalk    colorectal, kidney, head & neck cancer
VEGFR-1    SSAY                stalk
           GTSD                stalk
VEGFR-2    AQEK                stalk
           RIDY                loop


Example 11. Cleavage Profile of Granzyme B I99A/N218A Mutant
Figure 1 shows the sequence of caspase-3, a protein implicated in the apoptosis
pathway of many cell types. Wild-type granzyme B cleaves caspase-3 between the aspartate
and serine at residues 175 and 176, respectively. Mutations at positions 99 and 218 of
granzyme B change the specificity of this protease to cleavage between the aspartate and
alanine at residues 263 and 264, respectively.

Figure 2 shows a crystallographic model of caspase-3 focusing on the inactivation
sequence that is cleaved by I99A/N218A granzyme B at residues 260-265 (SEQ ID NO:2).
The specificity of the mutated granzyme B is shown in Figure 3A with varying residues at the
P2, P3 and P4 positions in a PS-SCL library. PS-SCL libraries in the form
P4-P3-P2-Asp-AMC were assayed with wild-type and N218A/I99A granzyme B and the substrate
specificity at each position is plotted by amino acid. The mutant showed (Figure 3B) a five
times greater preference for phenylalanine at the P2 position over proline in the wild-type.
Also, the mutant accommodates large hydrophobic amino acids including phenylalanine and
leucine at the P4 position, where the wild-type generally accommodates only isoleucine and
valine.

Figure 4 shows the cleavage of NH2-FSFDAT-COOH (SEQ ID NO:2), the caspase-3
inactivation sequence made up of residues 260-265 of caspase-3, as monitored by MALDI mass
spectrometry. The inactivation sequence was incubated with 100 nM wild-type granzyme B
or 1 µM I99A/N218A for 18 hours. The first panel shows the peptide alone, with a peak at
the correct molecular weight for the uncleaved peptide. The second panel shows the peptide
mixed with wild-type granzyme B. The peak again represents the correct molecular weight for
the uncleaved peptide, showing that wild-type granzyme B does not cleave the peptide. The
third panel shows the results for the peptide with the N218A/I99A mutant granzyme B. Here,
the peak has shifted to 538.04, a cleavage product of the appropriate size (538 Da) for the
cleaved peptide. The mutant granzyme B efficiently cleaves the peptide.

Figure 5 shows the results from three individual reactions run on an SDS-PAGE gel.
Three individual tubes containing approximately 50 µM of caspase-3 were incubated in the
presence of buffer only (lane 1), wild-type granzyme B (lane 2), or granzyme B I99A/N218A
(lane 3). Each reaction was terminated by the addition of 2X SDS sample buffer, then
heated to 95 °C, and run on a tricine gel. The first lane shows caspase-3 alone. The second
lane shows caspase-3 with wild-type granzyme B. The third lane shows caspase-3 with the
mutant granzyme B. The mutant is able to cleave the small subunit of caspase-3.

I99A/N218A granzyme B cleaved and inactivated full length caspase-3. Purified
caspase-3 (2 µM) was incubated with no protease, 100 nM of wild-type granzyme B, or 1 µM
I99A/N218A granzyme B for 18 hours in granzyme B activity buffer. 10 µL of each reaction
was diluted in 90 µL of caspase-3 activity buffer and caspase-3 activity was assayed by
cleavage of Ac-DEVD-AMC. Figure 6A shows a graph of caspase-3 activity plotted against
time. I99A/N218A granzyme B inactivated caspase-3 to a very low level of activity.
Wild-type granzyme B inactivated caspase-3 more than control, but did not have the effect
that the mutant has on caspase-3 activity. This is also shown in Figure 6B, where the Vmax
of caspase-3 activity, derived from the data represented in Figure 6A, is shown. Vmax in the
presence of the mutant granzyme B is approximately zero, whereas the wild-type only halves
the Vmax relative to control.

The mutant granzyme B was also effective in inhibiting caspase-3 activity and
apoptosis in cell lysates containing caspase-3. In Figure 7A, indicated amounts of
I99A/N218A granzyme B were added to cell lysates and incubated for 18 hours. Caspase-3
activity was then assayed by adding a fluorogenic substrate (Ac-DEVD-AMC) to a final
concentration of 200 µM. At low concentrations the mutant activates caspase-3 by cleaving
at the activation sequence (SEQ ID NO:4), but at high concentrations it inhibits caspase-3
by cleaving at the inactivation sequence. Thus, I99A/N218A granzyme B induces apoptosis at
low concentrations but inhibits apoptosis at high concentrations. Figure 7A plots caspase-3
activity against increasing concentrations of I99A/N218A mutant granzyme B. As the
concentration of the mutant granzyme B was increased in cell lysates, the caspase-3 activity
decreased.

Apoptosis was induced in cell lysates by adding 100 nM of wild-type granzyme B,
which activates caspase-3 by cleaving at the activation sequence, with or without the
indicated amount of I99A/N218A granzyme B, and incubated for 18 hours. Caspase-3
activity was assayed by cleavage of Ac-DEVD-AMC. Data were normalized for the
background caspase-3 activity induced by the I99A/N218A granzyme B. 100 nM of wild-type
granzyme B was added to cell extracts in all samples, with or without increasing
concentrations of I99A/N218A granzyme B as indicated in Figure 7B. As shown in Figure
7B, the mutant granzyme B antagonized the effect of wild-type granzyme B to induce
apoptosis by inactivating caspase-3. Figure 7B shows a graph with the fraction of caspase-3
activity with varying concentrations of mutant granzyme B in the presence of 100 nM
wild-type granzyme B. With increasing concentrations of mutant granzyme B, the caspase-3
activity decreased below the level it was in the presence of wild-type granzyme B alone.

EQUIVALENTS 

Although particular embodiments have been disclosed herein in detail, this has been 
done by way of example for purposes of illustration only, and is not intended to be limiting 
with respect to the scope of the appended claims, which follow. In particular, it is 
contemplated by the inventors that various substitutions, alterations, and modifications may 
be made to the invention without departing from the spirit and scope of the invention as 
defined by the claims. The choice of screening method, protease scaffold, or library type is 
believed to be a matter of routine for a person of ordinary skill in the art with knowledge of 
the embodiments described herein. Other aspects, advantages, and modifications are
considered to be within the scope of the following claims.